P450 OXYGENASES AND METHODS OF USE

REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 60/489,597, filed
July 22, 2003, which application is incorporated herein in its entirety.

STATEMENT OF GOVERNMENT SUPPORT
This invention was made with United States government support pursuant to grant
no. CA 55254, from the National Institutes of Health. The United States government has certain
rights in the invention.

FIELD OF THE DISCLOSURE 
The present disclosure relates to P450 oxygenases, particularly to taxoid 5α-hydroxylases,
and the nucleic acids that encode them, and to methods of using such oxygenase nucleic acids and
enzymes, for example, to produce Taxol™ (commonly known as paclitaxel) and other paclitaxel
intermediates.

BACKGROUND OF THE DISCLOSURE 
The complex diterpenoid Taxol™ (Bristol-Myers Squibb; common name paclitaxel) (Wani
et al., J. Am. Chem. Soc. 93:2325-2327, 1971) is a potent antimitotic agent with excellent activity
against a wide range of cancers, including ovarian and breast cancer (Arbuck and Blaylock, Taxol:
Science and Applications, CRC Press, Boca Raton, 397-415, 1995; Holmes et al., ACS Symposium
Series 583:31-57, 1995). Paclitaxel was isolated originally from the bark of the Pacific yew (Taxus
brevifolia). For a number of years, paclitaxel was obtained exclusively from yew bark, but low yields
of this compound from the natural source, coupled to the destructive nature of the harvest, prompted
new methods of paclitaxel production to be developed.

Total chemical syntheses of paclitaxel have been achieved (for review, see, Kingston et al.,
Prog. Chem. Org. Nat. Prod. 84:56-225, 2002) but the yields of the drug by this method are too low
to be practical. Paclitaxel currently is produced primarily by chemical semisynthesis from advanced
taxane metabolites (Holton et al., Taxol: Science and Applications, CRC Press, Boca Raton, 97-121,
1995; Hezari and Croteau, Planta Medica, 63:291-295, 1997) that are isolated from the needles (a
renewable resource) of various Taxus species. However, at least because of the increasing demand
for this drug both for use earlier in the course of cancer intervention and for new therapeutic
applications (Goldspiel, Pharmacotherapy 17:110S-125S, 1997), high-yield, cost-effective methods
of paclitaxel production continue to be needed. Some have proposed isolating paclitaxel from
alternative biological sources, such as the endophytic fungus Taxomyces andreanae (Stierle et al., J.
Nat. Prod. 58:1315-1324, 1995), or from Taxus cell cultures (Ketchum et al., Biotechnol. Bioeng.
62:97-105, 1999). However, these methods are also too inefficient to produce sufficient quantities of
the drug and have had limited commercial success.

Improving the production yield of paclitaxel from any biological system, whether intact
organisms (such as Taxus plants or paclitaxel-producing fungi) or cell cultures, would be facilitated
by a detailed understanding of the paclitaxel biosynthetic pathway. The paclitaxel biosynthetic
pathway is complex and believed to involve nearly 20 distinct steps (Floss and Mocek, Taxol:
Science and Applications, CRC Press, Boca Raton, 191-208, 1995; and Croteau et al., Curr. Top.
Plant Physiol. 15:94-104, 1996). However, relatively few of the enzymatic reactions and
intermediates of this complicated pathway have been defined in detail.

The first committed enzyme of the paclitaxel pathway is believed to be taxadiene synthase
(Koepp et al., J. Biol. Chem. 270:8686-8690, 1995), which cyclizes the common precursor
geranylgeranyl diphosphate (Hefner et al., Arch. Biochem. Biophys. 360:62-74, 1998) to taxadiene
(FIG. 1). The cyclized intermediate (i.e., taxa-4(5),11(12)-diene) subsequently undergoes
modification involving at least eight oxygenation steps, a formal dehydrogenation, an epoxide
rearrangement to an oxetane, and several acylations (Floss and Mocek, Taxol: Science and
Applications, CRC Press, Boca Raton, 191-208, 1995; and Croteau et al., Curr. Top. Plant Physiol.
15:94-104, 1996). Taxadiene synthase has been isolated from T. brevifolia and characterized (Hezari
et al., Arch. Biochem. Biophys. 322:437-444, 1995), the mechanism of action defined (Lin et al.,
Biochemistry 35:2968-2977, 1996), and the corresponding cDNA clone isolated and expressed
(Wildung and Croteau, J. Biol. Chem. 271:9201-9204, 1996).

The second specific step of paclitaxel biosynthesis is believed to be an oxygenation
(hydroxylation) reaction that introduces a hydroxyl group to position 5 of taxa-4(5),11(12)-diene to
produce taxa-4(20),11(12)-dien-5α-ol. Using a crude Taxus microsome preparation, Hefner et al.
(Methods Enzymol. 272:243-250, 1996) demonstrated a microsomal activity that catalyzed the
stereospecific hydroxylation of taxa-4(5),11(12)-diene to taxa-4(20),11(12)-dien-5α-ol (with
double-bond rearrangement) (Hefner et al., Chem. Biol. 3:479-489, 1996). This microsomal activity
was attributed to one or more cytochrome P450 oxygenases (Hefner et al., Chemistry and Biology
3:479-489, 1996). Cytochrome P450 oxygenases are enzymes that have a unique sulfur atom ligated
to the heme iron and that, when reduced, form carbon monoxide (CO) complexes. When complexed
to carbon monoxide, cytochrome P450 proteins display a major absorption peak (Soret band) near
450 nm.

Taxus microsomal preparations were further shown to catalyze the hydroxylation of
taxadiene or taxadien-5α-ol to the level of a pentaol (Hefner et al., Methods Enzymol. 272:243-250,
1996; Lovy Wheeler et al., Arch. Biochem. Biophys., 390:265-278, 2001). These results suggested
that the paclitaxel biosynthetic pathway included at least five distinct cytochrome P450 taxoid
oxygenases in the early parts of the pathway (Hezari et al., Planta Med. 63:291-295, 1997). Later
steps of the paclitaxel biosynthetic pathway are thought to include at least three additional
oxygenation steps (C1 and C7 hydroxylations and an epoxidation at C4-C20). These steps also are
believed to be catalyzed by cytochrome P450 enzymes, but these reactions reside too far down the
pathway to observe in microsomes by current experimental methods (Croteau et al., Curr. Topics
Plant Physiol. 15:94-104, 1995; Hezari et al., Planta Med. 63:291-295, 1997; Lovy Wheeler et al.,
Arch. Biochem. Biophys., 390:265-278, 2001). Since Taxus (yew) plants and cells do not appear to
accumulate taxoid metabolites bearing fewer than six oxygen atoms (e.g., hexaol or epoxypentaol)
(Kingston et al., Prog. Chem. Org. Nat. Prod. 61:1-206, 1993), such intermediates must be rapidly
transformed down the pathway, indicating that the oxygenations (hydroxylations) are relatively slow
pathway steps.

Taxus microsome preparations contain hundreds of different proteins, including an estimated
30 to 50 similar cytochrome P450 oxygenases (Hefner et al., Methods Enzymol. 272:243-250, 1996).
Biochemical purification of cytochrome P450 enzymes from Taxus microsomes (Hefner et al.,
Methods Enzymol. 272:243-250, 1996) is not practical, at least because the numerous cytochrome
P450 oxygenases present in this cell fraction have very similar physical properties (Mihaliak et
al., Methods Plant Biochem. 9:261-279, 1993). With no useful biochemical means to distinguish
among the many microsomal P450 oxygenases, it is not feasible to sufficiently purify any one
enzyme to obtain even short peptide sequences. As a result, other methods are needed to isolate and
characterize these important enzymes at the molecular level.

Differential display reverse transcription PCR (DD-RT PCR) has been used to isolate methyl
jasmonate-induced nucleic acids encoding taxoid oxygenases of the paclitaxel biosynthetic pathway
(see, for example, PCT Pub. No. WO01/34780). Several of the encoded oxygenase enzymes have
been expressed and functionally characterized (PCT Pub. No. WO01/34780; Schoendorf et al., Proc.
Natl. Acad. Sci. USA, 98:1501-1506, 2001; Jennewein et al., Proc. Natl. Acad. Sci. USA,
98:13595-13600, 2001; Jennewein et al., Arch. Biochem. Biophys., 413:262-270, 2003). However,
transcripts encoding taxoid oxygenases that are not, or only weakly, induced are likely to be missed
by the DD-RT PCR technique.

Paclitaxel is an important drug that is not efficiently produced using current methods.
Genetic engineering and recombinant technologies offer ways to increase paclitaxel and taxoid
yields. To capitalize on these technologies, there is a continuing need to identify and isolate the
genes encoding the enzymes of the paclitaxel biosynthetic pathway, including, for example, the
numerous oxygenase enzymes, and for methods of using such genes and enzymes to produce
paclitaxel and its intermediates.


SUMMARY OF THE DISCLOSURE 
This disclosure provides a novel P450 oxygenase, which is capable of incorporating oxygen
(for example, a hydroxyl group or epoxide ring) into a substrate, such as a taxoid. In some examples,
the disclosed oxygenase incorporates oxygen at the C5 position of a taxoid, wherein the disclosed
oxygenase is referred to as a taxoid 5-hydroxylase. In more specific examples, the oxygen is
incorporated in the alpha configuration at C5 of a taxoid, wherein the disclosed oxygenase is referred
to as a taxoid 5α-hydroxylase. In some examples, a taxoid substrate for a disclosed oxygenase
includes a taxadiene, such as taxa-4(5),11(12)-diene and taxa-4(20),11(12)-diene.

Encompassed within this disclosure are the protein and nucleic acid sequences of the
disclosed P450 oxygenase. Also provided are nucleotide and amino acid sequence variants,
oligonucleotides and protein fragments. This disclosure demonstrates that the disclosed enzymes
catalyze the oxygenation of taxoids, for example, at the C5 position. Evidence provided herein also
demonstrates the relaxed substrate specificity of the disclosed oxygenases. It is also disclosed that a
nucleic acid encoding a disclosed oxygenase, such as a 5α-hydroxylase cDNA, can be operatively
linked to a promoter and cells can be transfected with the recombinant polynucleotide.

Also provided herein are methods for using a disclosed oxygenase, such as a taxoid
5α-hydroxylase. Such methods include, without limitation, methods of using a taxoid
5α-hydroxylase to hydroxylate a taxoid substrate or to produce (or increase the yield of) paclitaxel or
paclitaxel intermediates. Examples of these methods include introducing a taxoid 5α-hydroxylase
recombinant polynucleotide into a cell, such as a Taxus cell, or contacting a taxoid with a
5α-hydroxylase polypeptide or functional fragment thereof. Also disclosed are taxoid
5α-hydroxylase-specific binding agents.

The foregoing and other features and advantages will become more apparent from the
following detailed description of several embodiments, which proceeds with reference to the
accompanying figures.

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1 shows an outline of the early steps of paclitaxel biosynthesis. Paclitaxel (1)
formation involves the cyclization of geranylgeranyl diphosphate (2) to taxa-4(5),11(12)-diene (3)
and cytochrome P450-mediated hydroxylation to taxa-4(20),11(12)-dien-5α-ol (4). The multiple
arrows between taxa-4(20),11(12)-dien-5α-ol (4) and paclitaxel (1) are representative of numerous
additional enzymatic steps in the paclitaxel biosynthetic pathway.

FIG. 2 shows an alignment of the deduced amino acid sequences of selected taxoid
hydroxylases. The sequences of taxoid 10β-hydroxylase (T10H; SEQ ID NO: 16), taxoid
13α-hydroxylase (T13H; SEQ ID NO: 18) and the clone S1 taxadiene 5α-hydroxylase (T5H; SEQ ID
NO: 2) are compared. Black boxes indicate identical residues for the three sequences; grey boxes
indicate identical residues for two of the three.

FIG. 3 shows substrate binding spectra of taxadiene isomers. Microsomes from
S. frugiperda cells enriched with 200 pmol recombinant taxoid 5α-hydroxylase (clone S1) were
employed. Taxa-4(20),11(12)-diene was assayed over a concentration range from 0.1 to 10 µM, and
a binding constant (Ks) of 4 ± 1 µM was determined (upper graph). Taxa-4(5),11(12)-diene was
assayed over a concentration range of 0.1 to 20 µM and a binding constant (Ks) of 6.5 ± 1.5 µM was
determined (lower graph).

FIG. 4 shows a kinetic evaluation of taxadiene isomers. Taxa-4(20),11(12)-diene (○) and
taxa-4(5),11(12)-diene (●) were evaluated with microsomes from S. frugiperda cells enriched with
50 pmol recombinant taxoid 5α-hydroxylase (A), and with microsomes from T. media suspension
cells containing about 50 pmol of total native cytochrome P450 (B). Substrate concentration
was varied from 1 to 500 µM in all cases. Taxa-4(20),11(12)-diene yielded an average Km value of
21.5 µM with Vrel of 135, and taxa-4(5),11(12)-diene yielded an average Km value of 36 µM with
Vrel of 100.

FIG. 5 shows a proposed, but not binding, mechanism for cytochrome P450 taxoid
5α-hydroxylase. This cytochrome P450-mediated conversion of taxa-4(5),11(12)-diene (3) and
taxa-4(20),11(12)-diene (5) to taxa-4(20),11(12)-dien-5α-ol (4) is believed to involve hydrogen
abstraction from C20 (in 3) or C5 (in 5) to provide a common allylic radical intermediate, followed
by oxygen insertion at the 5α-face to yield taxadien-5α-ol (4). Isomerization of 3 to 5 was not
observed, nor does the route via epoxide 6 with rearrangement seem likely.

FIGS. 6A and 6B collectively show an alignment of the deduced amino acid sequences of
selected taxoid oxygenases. The sequence of clone S1 taxoid 5α-hydroxylase (T5H; SEQ ID NO: 2)
is compared to the sequences of eight other taxoid oxygenases isolated from Taxus cuspidata. Each
of the illustrated oxygenases is known to have a positive CO difference spectrum and to oxidize
intermediates in the paclitaxel biosynthetic pathway or derivatives thereof (see, e.g., PCT Pub.
No. WO01/23586, Table 2). Oxygenase sequences other than T5H are designated as in PCT Pub.
No. WO01/23586 (herein, "F clones"). F31 is a taxoid 7β-hydroxylase (SEQ ID NO: 8); F72 is a
taxoid 14β-hydroxylase (SEQ ID NO: 12); F14 is a taxoid 10β-hydroxylase (SEQ ID NO: 16); and
F16 is a taxoid 13α-hydroxylase (SEQ ID NO: 18). F14 and F16 correspond to T10H and T13H,
respectively, in FIG. 2. The other illustrated F clones are F12 (SEQ ID NO: 4), F21 (SEQ ID NO: 6),
F51 (SEQ ID NO: 10), and F9 (SEQ ID NO: 14). Shaded amino acid residues are identical among all
sequences; ":" indicates conservative substitutions among the amino acid residues at that position.
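
As an illustrative aside to the kinetic values given for FIG. 4, the two taxadiene isomers can be compared through the ratio Vrel/Km. The sketch below assumes simple Michaelis-Menten behavior and assumes that both Km values are expressed in µM; the function names are hypothetical and are not part of the disclosure.

```python
# Illustrative only: compares the two substrates of FIG. 4 under an assumed
# Michaelis-Menten model. Km values are assumed to be in micromolar (uM).

def relative_rate(s_um: float, km_um: float, vrel: float) -> float:
    """Relative rate at substrate concentration s_um (same units as km_um)."""
    return vrel * s_um / (km_um + s_um)


def efficiency(km_um: float, vrel: float) -> float:
    """Relative catalytic efficiency, taken here as Vrel / Km."""
    return vrel / km_um


# Values as reported for FIG. 4.
SUBSTRATES = {
    "taxa-4(20),11(12)-diene": {"km_um": 21.5, "vrel": 135.0},
    "taxa-4(5),11(12)-diene": {"km_um": 36.0, "vrel": 100.0},
}

if __name__ == "__main__":
    for name, p in SUBSTRATES.items():
        print(f"{name}: Vrel/Km = {efficiency(**p):.2f}")
```

Under these assumptions the 4(20),11(12)-isomer has the higher Vrel/Km ratio, which is consistent with it having both the lower Km and the higher Vrel.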

BRIEF DESCRIPTION OF THE SEQUENCE LISTING 
The nucleic and amino acid sequences listed in the accompanying sequence listing are
shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids,
as defined in 37 C.F.R. §1.822. Only one strand of each nucleic acid sequence is shown, but the
complementary strand is understood as included by any reference to the displayed strand. In the
accompanying sequence listing:
SEQ ID NO: 1 shows a nucleic acid sequence encoding a taxoid 5α-hydroxylase (GenBank
Accession No. AY289209) and its corresponding amino acid sequence.

SEQ ID NO: 2 shows an amino acid sequence of a taxoid 5α-hydroxylase (GenBank
Accession No. AAQ56240.1), which is encoded by the nucleic acid sequence in SEQ ID NO: 1.

SEQ ID NO: 3 shows the nucleic acid sequence of Clone F12 described in PCT Pub.
No. WO01/23586.

SEQ ID NO: 4 shows the taxoid oxygenase amino acid sequence encoded by Clone F12.

SEQ ID NO: 5 shows the nucleic acid sequence of Clone F21 described in PCT Pub.
No. WO01/23586.

SEQ ID NO: 6 shows the taxoid oxygenase amino acid sequence encoded by Clone F21.

SEQ ID NO: 7 shows a nucleic acid sequence encoding a taxoid 7β-hydroxylase (see, also,
clone F31 as described in PCT Pub. No. WO01/23586) (GenBank Accession No. AY307951).

SEQ ID NO: 8 shows a taxoid 7β-hydroxylase amino acid sequence (GenBank Accession
No. AAQ7553), which is encoded by the nucleic acid sequence in SEQ ID NO: 7.

SEQ ID NO: 9 shows the nucleic acid sequence of Clone F51 described in PCT Pub.
No. WO01/23586.

SEQ ID NO: 10 shows the taxoid oxygenase amino acid sequence encoded by Clone F51.

SEQ ID NO: 11 shows a nucleic acid sequence encoding a taxoid 14β-hydroxylase (see,
also, clone F72 as described in PCT Pub. No. WO01/23586) (GenBank Accession No. AY188177;
Jennewein et al., Arch. Biochem. Biophys., 413(2):262-270, 2003).

SEQ ID NO: 12 shows a taxoid 14β-hydroxylase amino acid sequence (GenBank Accession
No. AAO66199), which is encoded by the nucleic acid sequence in SEQ ID NO: 11.

SEQ ID NO: 13 shows the nucleic acid sequence of Clone F9 described in PCT Pub.
No. WO01/23586.

SEQ ID NO: 14 shows the taxoid oxygenase amino acid sequence encoded by Clone F9.

SEQ ID NO: 15 shows a nucleic acid sequence encoding a taxoid 10β-hydroxylase (see,
also, clone F14 as described in PCT Pub. No. WO01/23586) (GenBank Accession No. AY563635;
Jennewein et al., Proc. Natl. Acad. Sci. USA, 101(24):9149-9154, 2004).

SEQ ID NO: 16 shows a taxoid 10β-hydroxylase amino acid sequence (GenBank Accession
No. AAT47183) encoded by the nucleic acid sequence in SEQ ID NO: 15.

SEQ ID NO: 17 shows a nucleic acid sequence encoding a taxoid 13α-hydroxylase (see,
also, clone F16 as described in PCT Pub. No. WO01/23586) (GenBank Accession No. AY056019;
Jennewein et al., Proc. Natl. Acad. Sci. USA, 98(24):13595-13600, 2001).

SEQ ID NO: 18 shows a taxoid 13α-hydroxylase amino acid sequence (GenBank Accession
No. AAL23619) encoded by the nucleic acid sequence in SEQ ID NO: 17.

SEQ ID NO: 19 shows a nucleic acid sequence encoding a taxadiene synthase (GenBank
Accession No. U48796; Wildung and Croteau, J. Biol. Chem., 271(16):9201-9204, 1996) and its
corresponding amino acid sequence.

SEQ ID NO: 20 shows the amino acid sequence of a taxadiene synthase (GenBank
Accession No. AAC49310), which is encoded by the nucleic acid sequence in SEQ ID NO: 19.

SEQ ID NO: 21 shows a nucleic acid sequence encoding a taxadienol acetyl transferase
(also called, TAT or TAX1) (GenBank Accession No. AF190130; Walker et al., Arch. Biochem.
Biophys., 374(2):371-380, 2000) and its corresponding amino acid sequence.

SEQ ID NO: 22 shows the amino acid sequence of a taxadienol acetyl transferase
(GenBank Accession No. AAF34254), which is encoded by the nucleic acid sequence in SEQ ID
NO: 21.

SEQ ID NO: 23 shows a nucleic acid sequence encoding a 2-debenzoyl-
7,13-diacetylbaccatin III-2-O-benzoyl transferase (also called, TAX2) (GenBank Accession
No. AF297618; Walker and Croteau, Proc. Natl. Acad. Sci. USA, 97(25):13591-13596, 2000) and its
corresponding amino acid sequence.

SEQ ID NO: 24 shows the amino acid sequence of a 2-debenzoyl-7,13-diacetylbaccatin
III-2-O-benzoyl transferase (GenBank Accession No. AAO38049), which is encoded by the nucleic
acid sequence in SEQ ID NO: 23.

SEQ ID NOs: 25-29 show primers directed to the commonly occurring P450 oxygenase
PERF motif and its variant forms.

SEQ ID NOs: 30-31 show primers directed to the conserved P450 oxygenase heme-binding
region.

SEQ ID NOs: 32-33 show primers suitable for amplifying a nucleic acid sequence encoding
a taxoid 5α-hydroxylase.

SEQ ID NO: 34 shows a nucleic acid sequence encoding a 10-deacetylbaccatin
III-10-O-acetyl transferase (also called, TAX6 or DBAT) (GenBank Accession No. AF193765; e.g.,
Walker and Croteau, Proc. Natl. Acad. Sci. USA, 97(2):583-587, 2000) and its corresponding amino
acid sequence.

SEQ ID NO: 35 shows the amino acid sequence of a 10-deacetylbaccatin III-10-O-acetyl
transferase, which is encoded by the nucleic acid sequence in SEQ ID NO: 34.

SEQ ID NO: 36 shows a nucleic acid sequence encoding a taxoid
13-phenylpropanoyltransferase (also called, TAX7) (GenBank Accession No. AY082804; Walker et
al., Proc. Natl. Acad. Sci. USA, 99(20):12715-12720, 2002) and its corresponding amino acid
sequence.

SEQ ID NO: 37 shows the amino acid sequence of a taxoid 13-phenylpropanoyltransferase,
which is encoded by the nucleic acid sequence in SEQ ID NO: 36.

SEQ ID NO: 38 shows a nucleic acid sequence encoding a taxoid 3'-N-debenzoyltaxol
N-benzoyltransferase (also called, TAX10 or DBNTBT) (GenBank Accession No. AF466397;
Walker et al., Proc. Natl. Acad. Sci. USA, 99(14):9166-9171, 2002) and its corresponding amino acid
sequence.

SEQ ID NO: 39 shows the amino acid sequence of a taxoid 3'-N-debenzoyltaxol
N-benzoyltransferase, which is encoded by the nucleic acid sequence in SEQ ID NO: 38.

SEQ ID NO: 40 shows a nucleic acid sequence encoding a taxoid 2α-hydroxylase
(GenBank Accession No. AY518383; Chau and Croteau, Arch. Biochem. Biophys., 427(1):48-57,
2004) and its corresponding amino acid sequence.

SEQ ID NO: 41 shows the amino acid sequence of a taxoid 2α-hydroxylase, which is
encoded by the nucleic acid sequence in SEQ ID NO: 40.






DETAILED DESCRIPTION 

I. Abbreviations and Terms

CO        carbon monoxide
ELISA     enzyme-linked immunosorbent assay
GC-MS     gas chromatography-mass spectrometry
HPLC      high performance liquid chromatography
HSQC      heteronuclear single quantum coherence
KIE       kinetic isotope effect
kDa       kilodaltons
MW        molecular weight
NMR       nuclear magnetic resonance spectroscopy
ORF       open reading frame
RACE      rapid amplification of cDNA ends
ROESY     rotating-frame nuclear Overhauser effect spectroscopy
TLC       thin layer chromatography
TOCSY     total correlation spectroscopy



Unless otherwise noted, technical terms are used according to conventional usage.
Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes,
published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The
Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9);
and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk
Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).

In order to facilitate review of the various embodiments disclosed herein, the following
explanations of specific terms are provided:

Amplification: When used in reference to nucleic acids, techniques that increase the
number of copies of a nucleic acid molecule in a sample or specimen. An example of amplification is
the polymerase chain reaction, in which a biological sample collected from a subject is contacted with
a pair of oligonucleotide primers, under conditions that allow for the hybridization of the primers to
nucleic acid template in the sample. The primers are extended under suitable conditions, dissociated
from the template, and then re-annealed, extended, and dissociated to amplify the number of copies of
the nucleic acid. The product of in vitro amplification can be characterized by electrophoresis,
restriction endonuclease cleavage patterns, oligonucleotide hybridization or ligation, and/or nucleic
acid sequencing, using standard techniques. Other examples of in vitro amplification techniques
include strand displacement amplification (see U.S. Patent No. 5,744,311); transcription-free
isothermal amplification (see U.S. Patent No. 6,033,881); repair chain reaction amplification (see WO
90/01069); ligase chain reaction amplification (see EP-A-320 308); gap filling ligase chain reaction
amplification (see U.S. Patent No. 5,427,930); coupled ligase detection and PCR (see U.S. Patent No.
6,027,889); and NASBA™ RNA transcription-free amplification (see U.S. Patent No. 6,025,134).
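
The repeated anneal, extend, and dissociate cycle described above for the polymerase chain reaction can be illustrated with an idealized copy-number model. This is a minimal sketch, assuming every cycle multiplies the template count by (1 + efficiency); the function name and the efficiency parameter are hypothetical and do not come from this disclosure.

```python
# Idealized PCR yield: each cycle multiplies the copy number by (1 + efficiency),
# where efficiency = 1.0 means perfect doubling. Real reactions plateau; this
# sketch ignores reagent depletion and is for illustration only.

def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Return the expected number of template copies after a number of cycles."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, copies_after_cycles(1, n))
```

With perfect doubling, a single template molecule gives 2^n copies after n cycles, which is why a modest number of cycles is enough to produce a detectable product.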

Binding or stable binding: An oligonucleotide binds or stably binds to a target nucleic acid
if a sufficient amount of the oligonucleotide forms base pairs or is hybridized to its target nucleic
acid, to permit detection of that binding. Binding can be detected by either physical or functional
properties of the target:oligonucleotide complex. Binding between a target and an oligonucleotide
can be detected by any procedure known to one of ordinary skill in the art, including both functional
and physical binding assays. Binding can be detected functionally by determining whether binding
has an observable effect upon a biosynthetic process such as expression of a gene, DNA replication,
transcription, translation and the like.

Physical methods of detecting the binding of complementary strands of DNA or RNA are
well known in the art, and include such methods as DNase I or chemical footprinting, gel shift and
affinity cleavage assays, Northern blotting, dot blotting and light absorption detection procedures.
For example, one method that is widely used, because it is so simple and reliable, involves observing
a change in light absorption of a solution containing an oligonucleotide (or an analog) and a target
nucleic acid at 220 to 300 nm as the temperature is slowly increased. If the oligonucleotide or analog
has bound to its target, there is a sudden increase in absorption at a characteristic temperature as the
oligonucleotide (or analog) and the target disassociate from each other, or melt.

The binding between an oligomer and its target nucleic acid is frequently characterized by
the temperature (Tm) at which 50% of the oligomer is melted from its target. A higher Tm means a
stronger or more stable complex relative to a complex with a lower Tm.
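
To make the relationship between sequence composition and Tm concrete, the sketch below applies the Wallace rule, a widely used rule of thumb for short oligonucleotides (Tm ≈ 2 °C per A or T plus 4 °C per G or C). The rule is not taken from this disclosure and gives only a rough estimate; the function name is hypothetical.

```python
# Rough Tm estimate for a short DNA oligonucleotide using the Wallace rule:
#   Tm (deg C) ~= 2 * (A + T) + 4 * (G + C)
# This is a rule of thumb for short oligos, not a measured melting temperature.

def wallace_tm(oligo: str) -> int:
    """Estimate the melting temperature (deg C) of a short DNA oligonucleotide."""
    seq = oligo.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for oligo in ("AAAATTTT", "ATGCATGC", "GGGGCCCC"):
        print(oligo, wallace_tm(oligo))
```

GC-rich oligomers receive a higher estimate than AT-rich oligomers of the same length, in line with the statement that a higher Tm indicates a more stable complex.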

cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments
(introns) and transcriptional regulatory sequences. cDNA can also contain untranslated regions
(UTRs) that can be responsible for translational control in the corresponding RNA molecule. cDNA
can be synthesized in the laboratory by reverse transcription from messenger RNA extracted from
cells.

DNA (deoxyribonucleic acid): A long chain polymer which comprises the genetic material
of most living organisms (some viruses have genes comprising ribonucleic acid (RNA)). The
repeating units in DNA polymers are four different nucleotides, each of which comprises one of the
four bases, adenine, guanine, cytosine and thymine, bound to a deoxyribose sugar to which a
phosphate group is attached. Triplets of nucleotides (referred to as codons) code for each amino acid
in a polypeptide. The term codon is also used for the corresponding (and complementary) sequences
of three nucleotides in the mRNA into which the DNA sequence is transcribed.

Unless otherwise specified, any reference to a DNA molecule is intended to include the
reverse complement of that DNA molecule. Except where single strandedness is required by the text
herein, DNA molecules, though written to depict only a single strand, encompass both strands of a
double-stranded DNA molecule. Thus, a reference to the nucleic acid molecule that encodes a
specific protein, or a fragment thereof, encompasses both the sense strand and its reverse
complement. Thus, for instance, it is appropriate to generate probes or primers from the reverse
complement sequence of the disclosed nucleic acid molecules.
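
Because any reference to a DNA molecule is taken to include its reverse complement, it can help to see how the second strand is derived from the displayed one. The following is a minimal sketch for unambiguous DNA letters only; the function name is hypothetical.

```python
# Derive the reverse complement of a DNA strand written 5' to 3'.
# Only the unambiguous bases A, C, G and T (either case) are handled.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    unexpected = set(dna.upper()) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```

Applying the function twice returns the original strand, which reflects the fact that either strand fully determines the other.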

Encode: A polynucleotide is said to "encode" a polypeptide if, in its native state or when
manipulated by methods well known to those of ordinary skill in the art, it can be transcribed and/or
translated to produce the mRNA for and/or the polypeptide or a fragment thereof. The anti-sense
strand is the complement of such a nucleic acid, and the encoding sequence can be deduced
therefrom.

Functional fragments and variants of a polypeptide: Included are those fragments and
variants that maintain one or more functions of the parent polypeptide. It is recognized that the gene
or cDNA encoding a polypeptide can be considerably mutated without materially altering one or more
of the polypeptide's functions. First, the genetic code is well-known to be degenerate, and thus different
codons encode the same amino acids. Second, even where an amino acid substitution is introduced,
the mutation can be conservative and have no material impact on the essential functions of a protein
(see, Stryer, Biochemistry, Third Edition, W.H. Freeman and Company, New York, N.Y., p. 769,
1988). Third, part of a polypeptide chain can be deleted without impairing or eliminating all of its
functions. Fourth, insertions or additions can be made in the polypeptide chain, for example, adding
epitope tags, without impairing or eliminating its functions (Ausubel et al., Current Protocols in
Molecular Biology, Greene Publ. Assoc. and Wiley-Intersciences, 1997; Jennewein et al., Arch.
Biochem. Biophys., 413:262-270, 2003). Other modifications that can be made without materially
impairing one or more functions of a polypeptide include, for example, in vivo or in vitro chemical
and biochemical modifications or the incorporation of unusual amino acids. Such modifications
include, for example, acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination,
labeling, e.g., with radionuclides, and various enzymatic modifications, as will be readily
appreciated by ordinarily skilled artisans. A variety of methods for labeling polypeptides, and labels
which are useful for such purposes, are well known in the art, and include radioactive isotopes such as
³²P, ligands that bind to or are bound by labeled specific binding partners (e.g., antibodies),
fluorophores, chemiluminescent agents, enzymes, and antiligands. Functional fragments and variants
can be of varying length. For example, some functional fragments have at least 75, 100, 200, 300 or
400 amino acid residues.

A functional fragment or variant of a disclosed P450 oxygenase, such as a taxoid
5α-hydroxylase, is defined herein as a polypeptide capable of oxidizing (for example, hydroxylating
or epoxidizing) a taxoid. In specific examples, a functional fragment or variant oxidizes a taxoid at
the C5 position. It includes any polypeptide of about 100 or more amino acid residues in length,
which is capable of having taxoid oxygenase activity.
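
The degeneracy of the genetic code noted above, under which different codons encode the same amino acid, can be shown with the standard codon table. The sketch below builds that table from its conventional compact form (bases ordered T, C, A, G); the helper names are hypothetical, and the example sequences are arbitrary illustrations rather than sequences from the sequence listing.

```python
# Standard genetic code in compact form: codons enumerated with bases in the
# order T, C, A, G (first base varying slowest); "*" marks a stop codon.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> list:
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    print(translate("ATGCTTTAA"), translate("ATGCTGTAA"))
    print(synonymous_codons("L"))
```

The two example sequences differ at one nucleotide yet translate to the same peptide, which is the kind of silent variation the definition contemplates.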

Heterologous: A type of sequence that is not normally (i.e., in the wild-type sequence)
found adjacent to a second sequence. In one embodiment, the sequence is from a different genetic
source, such as a virus or organism, than the second sequence.

Host cell: Any cell that is capable of being transformed with a recombinant nucleic acid
sequence. Examples include bacterial cells, fungal cells, plant cells (such as, Taxus cells), insect cells,
avian cells, mammalian cells, and amphibian cells. A host cell can be isolated or it can exist as a part
of a transgenic organism (such as, a microorganism (or lower life form) or a macroorganism). In
specific examples, a host cell can be a primary cell or a cell line. A primary cell is a cell that is taken
directly from a living organism, such as a plant (e.g., a plant from the genus Taxus), which is not
immortalized. The term "cell line" refers to a cell that is able to replicate in culture. Some cell lines
(often called immortal cells) are capable of an essentially unlimited number of cell divisions. A
primary cell may become a cell line upon continuous culture. In some instances, an immortal cell can
arise spontaneously, for example, as a result of uncharacterized alterations in the cell genome. In
other cases, a cell, such as a primary cell, can be made immortal using techniques commonly known in
the art, including transfection with SV40 T-antigen or telomerase reverse transcriptase (TERT) (for
review, e.g., Hahn, Mol. Cells, 13(3):351-361, 2002).

Hybridization: Oligonucleotides and other nucleic acids hybridize by hydrogen bonding,
which includes Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between
complementary bases. Generally, nucleic acid consists of nitrogenous bases that are either
pyrimidines (cytosine (C), uracil (U), and thymine (T)) or purines (adenine (A) and guanine (G)).
These nitrogenous bases form hydrogen bonds between a pyrimidine and a purine, and the bonding of
the pyrimidine to the purine is referred to as base pairing. More specifically, A will hydrogen bond
to T or U, and G will bond to C. Complementary refers to the base pairing that occurs between two
distinct nucleic acid sequences or two distinct regions of the same nucleic acid sequence.
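
The pairing rules above can be sketched in a few lines of Python. This is only an illustration of Watson-Crick pairing for DNA (A with T, G with C); the function names are hypothetical and are not part of this disclosure, and Hoogsteen pairing and RNA (U in place of T) are deliberately left out.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that base-pairs with `seq`, written 5'->3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(seq.upper()))


def is_fully_complementary(first: str, second: str) -> bool:
    """True when every base of `first` pairs with the antiparallel strand `second`."""
    return reverse_complement(first) == second.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(is_fully_complementary("ATGC", "GCAT"))
```

Both strands are written 5' to 3', so the complementary strand is the reverse of the base-by-base complement.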

Specifically hybridizable and specifically complementary are terms that indicate a
sufficient degree of complementarity such that stable and specific binding occurs between a first
nucleic acid (such as, an oligonucleotide) and a DNA or RNA target. The first nucleic acid (such as,
an oligonucleotide) need not be 100% complementary to its target sequence to be specifically
hybridizable. A first nucleic acid (such as, an oligonucleotide) is specifically hybridizable when there
is a sufficient degree of complementarity to avoid non-specific binding of the first nucleic acid (such
as, an oligonucleotide) to non-target sequences under conditions where specific binding is desired.
Such binding is referred to as specific hybridization.

Hybridization conditions resulting in particular degrees of stringency will vary depending
upon the nature of the hybridization method of choice and the composition and length of the
hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic
strength (especially the Na+ concentration) of the hybridization buffer will determine the stringency of
hybridization, though wash times also influence stringency. Calculations regarding hybridization
conditions required for attaining particular degrees of stringency are discussed by Sambrook et al.
(ed.), Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY, 1989, chapters 9 and 11.

The following are exemplary sets of hybridization conditions and are not meant to be
limiting.

Very High Stringency (detects sequences that share 90% sequence identity)
Hybridization: 5x SSC at 65°C for 16 hours
Wash twice: 2x SSC at room temperature (RT) for 15 minutes each
Wash twice: 0.5x SSC at 65°C for 20 minutes each

High Stringency (detects sequences that share 80% sequence identity or greater)
Hybridization: 5x-6x SSC at 65°C-70°C for 16-20 hours
Wash twice: 2x SSC at RT for 5-20 minutes each
Wash twice: 1x SSC at 55°C-70°C for 30 minutes each

Low Stringency (detects sequences that share greater than 50% sequence identity)
Hybridization: 6x SSC at RT to 55°C for 16-20 hours
Wash at least twice: 2x-3x SSC at RT to 55°C for 20-30 minutes each.

Isolated: A biological component (such as a nucleic acid molecule, protein or organelle)
that has been substantially separated or purified away from other biological components in the cell of
the organism in which the component naturally occurs, i.e., other chromosomal and extra-chromosomal
DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been
isolated include nucleic acids and proteins purified by standard purification methods. The term also
embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as
chemically synthesized nucleic acids.

Nucleotide: This term includes, but is not limited to, a monomer that includes a base linked 
to a sugar, such as a pyrimidine, purine or synthetic analogs thereof, or a base linked to an amino 
acid, as in a peptide nucleic acid (PNA). A nucleotide is one monomer in a polynucleotide. A
nucleotide sequence refers to the sequence of bases in a polynucleotide.

Oligonucleotide: A plurality of nucleotides joined by native phosphodiester bonds,
between about 6 and about 300 nucleotides in length. An oligonucleotide analog refers to moieties
that function similarly to oligonucleotides but have non-naturally occurring portions. For example,
oligonucleotide analogs can contain non-naturally occurring portions, such as altered sugar moieties
or inter-sugar linkages, such as a phosphorothioate oligodeoxynucleotide. Functional analogs of
naturally occurring polynucleotides can bind to RNA or DNA, and include peptide nucleic acid
(PNA) molecules.

Particular oligonucleotides and oligonucleotide analogs can include linear sequences up to
about 200 nucleotides in length, for example a sequence (such as DNA or RNA) that is at least 6
bases, for example at least 8, 10, 15, 20, 25, 30, 35, 40, 45, 50, 100 or even 200 bases long, or from
about 6 to about 50 bases, for example about 10-25 bases, such as 12, 15 or 20 bases.

Operably linked: A first nucleic acid sequence is operably linked with a second nucleic 
acid sequence when the first nucleic acid sequence is placed in a functional relationship with the 
second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the
promoter affects the transcription or expression of the coding sequence. Generally, operably linked
DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same
reading frame.

Open reading frame (ORF): A series of nucleotide triplets (codons) coding for amino
acids without any internal termination codons. These sequences are usually translatable into a
peptide.
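
As an illustration of this definition, the following Python sketch checks whether a DNA string reads as whole triplets with no internal termination codon. It assumes the standard termination codons (TAA, TAG, TGA) and, as a further assumption not stated above, tolerates a termination codon in the final position; the function names are hypothetical.

```python
# Standard termination codons, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq: str) -> list:
    """Split a sequence into consecutive triplets, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if `seq` is whole triplets with no internal termination codon.

    A termination codon in the last position is allowed (an assumption of this sketch).
    """
    if not seq or len(seq) % 3 != 0:
        return False
    triplets = split_codons(seq)
    return not any(codon in STOP_CODONS for codon in triplets[:-1])


if __name__ == "__main__":
    print(is_open_reading_frame("ATGGACGCCTGA"))  # stop codon only at the end
    print(is_open_reading_frame("ATGTGAGCC"))     # internal TGA
```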

Organism: Any individual living thing, whether unicellular or multi-cellular and including
all members of Archaea, Bacteria, and Eukaryota taxonomical classifications, such as plants, yeast,
bacteria, fungi, and insects.

Ortholog: Two nucleic acid or amino acid sequences are orthologs of each other if they
share a common ancestral sequence and diverged when a species carrying that ancestral sequence
split into two species. Orthologous sequences are also homologous sequences.

Oxidation: The process of incorporating oxygen into a molecule, such as a substrate of a P450
oxygenase. Specific types of oxidation include, for example, epoxidation and hydroxylation.
"Epoxidation" involves a chemical reaction in which an oxygen atom is joined to an olefinically
unsaturated molecule to form a cyclic, three-membered ether. An "olefin" is a hydrocarbon
containing a carbon-carbon double bond. "Hydroxylation" is a chemical reaction in which a
hydroxyl (-OH) group is incorporated into a molecule.

Oxygenase activity: Enzymes exhibiting oxygenase activity are capable of directly
incorporating oxygen into a substrate molecule. The process of incorporating oxygen into a substrate
molecule is called "oxidation." Oxygenases can be either dioxygenases, in which case the oxygenase
incorporates two oxygen atoms into the substrate; or monooxygenases, in which only one oxygen
atom is incorporated into the primary substrate, for example, to form a hydroxyl or epoxide group.
Monooxygenases also may be referred to as "hydroxylases." Taxoid oxygenases are a subset of
oxygenases that specifically utilize taxoids as substrates. Taxoid oxygenases can utilize, for example,
taxoid substrates having a methylene group at any position, including, for example, taxoids having a
5-methylene group (such as, taxoid 5α-hydroxylases), taxoids having a 2-methylene group (such as,
taxoid 2α-hydroxylases), taxoids having a 7-methylene group (such as, taxoid 7β-hydroxylases),
taxoids having a 10-methylene group (such as, taxoid 10β-hydroxylases), taxoids having a
13-methylene group (such as, taxoid 13α-hydroxylases), or taxoids having a 14-methylene group
(such as, taxoid 14β-hydroxylases).

Oxygenases: Oxygenases are enzymes that display oxygenase activity as described above. A
particular oxygenase may recognize one or more substrates. An oxygenase that will recognize more
than one substrate is said to have "relaxed substrate specificity." Different oxygenases may
recognize the same substrate and have "shared substrate specificity." Oxygenase enzyme activity
assays may utilize one or more different substrates depending on the specificity(ies) of the particular
oxygenase enzyme. One of ordinary skill in the art will appreciate that a variety of general
oxygenase activity assays, including, for instance, the spectrophotometry-based assay described
herein, are available, and that direct assays can be used to test oxygenase catalysis directed towards
different substrates.

Polypeptide: A polymer in which the monomers are amino acid residues joined together
through amide bonds. When the amino acids are alpha-amino acids, either the L-optical isomer or the
D-optical isomer can be used, the L-isomers being preferred. The term polypeptide or protein as used
herein encompasses any amino acid sequence and includes modified sequences such as glycoproteins.
The term polypeptide is specifically intended to cover naturally occurring proteins, as well as those
that are recombinantly or synthetically produced. The term(s) "isolated polypeptide" (or isolated
protein) as used herein refers to a polypeptide that is substantially free of other proteins, lipids,
carbohydrates or other materials with which it is naturally associated. In one embodiment, the
polypeptide is at least 50%, for example at least 70%, at least 80%, at least 90%, or at least 95%, free
of other proteins, lipids, carbohydrates or other materials with which it is naturally associated.

Probes and primers: Nucleic acid probes and primers can be readily prepared based on the
nucleic acid molecules provided in this disclosure. A probe comprises a detectable isolated nucleic
acid. In some instances, a probe is directly attached to a detectable label or reporter molecule.
Typical labels include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent
or fluorescent agents, haptens, and enzymes. Methods for labeling and guidance in the choice of
labels appropriate for various purposes are discussed, e.g., in Sambrook et al. (In: Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1989) and
Ausubel et al. (In: Current Protocols in Molecular Biology, Greene Publ. Assoc. and
Wiley-Intersciences, 1992).

Primers are short nucleic acid molecules, such as DNA oligonucleotides 10 nucleotides or
more in length. Longer DNA oligonucleotides can be about 15, 17, 20, or 23 nucleotides or more in
length. Primers can be annealed to a complementary target DNA strand by nucleic acid hybridization
to form a hybrid between the primer and the target DNA strand, and then the primer extended along
the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a
nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid
amplification methods known in the art.

Methods for preparing and using probes and primers are described, for example, in
Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory
Press, New York, 1989), Ausubel et al. (In Current Protocols in Molecular Biology, Greene Publ.
Assoc. and Wiley-Intersciences, 1998), and Innis et al. (PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., San Diego, CA, 1990). PCR primer pairs can be derived from a
known sequence, for example, by using computer programs intended for that purpose such as Primer
(Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, MA). One of
ordinary skill in the art will appreciate that the specificity of a particular probe or primer increases
with its length. Thus, in order to obtain greater specificity, probes and primers can be selected that
comprise at least 17, 20, 23, 25, 30, 35, 40, 45, 50 or more consecutive nucleotides.
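
The link between length and specificity can be illustrated with a rough calculation. The Python sketch below assumes a random target sequence in which each of the four bases is equally likely at every position, which real genomes do not satisfy; the function name and the target length are illustrative assumptions and do not come from this disclosure.

```python
def expected_chance_matches(primer_length: int, target_length: int) -> float:
    """Expected number of exact chance matches of one primer on one strand.

    Assumes independent positions with probability 1/4 per base, so a given
    primer of length n matches any one start position with probability (1/4)**n.
    """
    if primer_length < 1 or target_length < primer_length:
        raise ValueError("primer must be at least 1 base and no longer than the target")
    start_positions = target_length - primer_length + 1
    return start_positions * 0.25 ** primer_length


if __name__ == "__main__":
    hypothetical_target = 10 ** 9  # illustrative target length in bases
    for length in (10, 17, 20, 25):
        print(length, expected_chance_matches(length, hypothetical_target))
```

Under these assumptions each added base reduces the expected number of chance matches roughly fourfold, which is one way to read the preference for primers of 17 or more nucleotides.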

Protein: A biological molecule expressed by a gene and comprised of amino acids.

Purified: The term purified does not require absolute purity; rather, it is intended as a
relative term. Thus, for example, a purified protein preparation is one in which the protein referred to
is more pure than the protein in its natural environment within a cell. For example, a preparation of
an enzyme can be considered as purified if the enzyme content in the preparation represents at least
50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95% of the total protein content
of the preparation.

Recombinant: A nucleic acid that has a sequence that is not naturally occurring or has a 
sequence that is made by an artificial combination of two otherwise separated segments of sequence. 
This artificial combination can be acconq)lished by chemical synthesis or, more commonly, by the 
artificial manipulation of isolated segments of nucleic acids, by genetic engineering techniques.
"Recombinant" also is used to describe nucleic acid molecules that have been artificially manipulated,
but contain the same control sequences and coding regions that are found in the organism from which
the gene was isolated.

Sequence identity: The similarity between two nucleic acid sequences or between two
amino acid sequences is expressed in terms of the level of sequence identity shared between the
sequences. Sequence identity is typically expressed in terms of percentage identity; the higher the
percentage, the more similar the two sequences.
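
A minimal Python sketch of a percentage-identity calculation follows. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identical positions over all alignment columns; this convention is an assumption of the sketch, other conventions exist, and alignment programs may report identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Columns where both sequences have a gap are skipped; a gap opposite a
    residue counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


if __name__ == "__main__":
    print(percent_identity("ATGC", "ATGA"))
    print(percent_identity("AT-C", "ATGC"))
```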

Methods for aligning sequences for comparison are well known in the art. Various programs
and alignment algorithms are described in: Smith and Waterman, Adv. Appl. Math. 2:482, 1981;
Needleman and Wunsch, J. Mol. Biol. 48:443, 1970; Pearson and Lipman, Proc. Natl. Acad. Sci. USA
85:2444, 1988; Higgins and Sharp, Gene 73:237-244, 1988; Higgins and Sharp, CABIOS 5:151-153,
1989; Corpet et al., Nucleic Acids Research 16:10881-10890, 1988; Huang et al., Computer
Applications in the Biosciences 8:155-165, 1992; Pearson et al., Methods in Molecular Biology
24:307-331, 1994; Tatiana et al., FEMS Microbiol. Lett., 174:247-250, 1999. Altschul et al.
present a detailed consideration of sequence-alignment methods and homology calculations (J. Mol.
Biol. 215:403-410, 1990).

The National Center for Biotechnology Information (NCBI) Basic Local Alignment Search
Tool (BLAST™, Altschul et al., J. Mol. Biol. 215:403-410, 1990) is available from several sources,
including the National Center for Biotechnology Information (NCBI, Bethesda, MD) and on the
Internet, for use in connection with the sequence-analysis programs blastp, blastn, blastx, tblastn and
tblastx. A description of how to determine sequence identity using this program is available on the
Internet under the help section for BLAST™.

For comparisons of amino acid sequences of greater than about 30 amino acids, the "Blast 2
sequences" function of the BLAST™ (Blastp) program is employed using the default BLOSUM62
matrix set to default parameters (cost to open a gap [default = 5]; cost to extend a gap [default = 2];
penalty for a mismatch [default = -3]; reward for a match [default = 1]; expectation value (E)
[default = 10.0]; word size [default = 3]; number of one-line descriptions (V) [default = 100]; number
of alignments to show (B) [default = 100]). When aligning short peptides (fewer than around 30
amino acids), the alignment should be performed using the Blast 2 sequences function, employing the
PAM30 matrix set to default parameters (open gap 9, extension gap 1 penalties). Proteins with even
greater similarity to the reference sequences will show increasing percentage identities when assessed
by this method, such as at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least
90%, or at least 95% sequence identity.

For comparisons of nucleic acid sequences, the "Blast 2 sequences" function of the
BLAST™ (Blastn) program is employed using the default BLOSUM62 matrix set to default
parameters (cost to open a gap [default = 11]; cost to extend a gap [default = 1]; expectation value (E)
[default = 10.0]; word size [default = 11]; number of one-line descriptions (V) [default = 100];
number of alignments to show (B) [default = 100]). Nucleic acid sequences with even greater
similarity to the reference sequences will show increasing percentage identities when assessed by this
method, such as at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at
least 95%, or at least 98% sequence identity.

An alternative indication that two nucleic acid molecules are closely related is that the two
molecules hybridize to each other under stringent conditions (see "Hybridization" above).

Nucleic acid sequences that do not show a high degree of identity can nevertheless encode
similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that
changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
molecules that all encode substantially the same protein.
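
The effect of degeneracy can be shown with a short Python sketch that translates two different nucleotide sequences using the standard genetic code. The two 18-nucleotide sequences are illustrative examples chosen for this sketch, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order
# (first base varies slowest, third base fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA coding sequence, reading complete codons from position 0."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Two illustrative sequences that differ at several third-codon positions
# (and at the first position of the leucine codon).
SEQ_A = "ATGGACGCCCTGTATAAG"
SEQ_B = "ATGGATGCATTATACAAA"

if __name__ == "__main__":
    print(translate(SEQ_A))
    print(translate(SEQ_B))
```

The two sequences differ at six of eighteen positions yet encode the same six-residue peptide, so nucleotide identity can understate how closely two encoded proteins are related.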

Specific binding agent: An agent that binds substantially only to a defined target. For
example, a taxoid 5α-hydroxylase protein-specific binding agent binds substantially only the taxoid
5α-hydroxylase protein.

Antibodies are exemplar specific binding agents. Antibodies can be produced using standard
procedures described in a number of texts, including Harlow and Lane (Antibodies, A Laboratory
Manual, Cold Spring Harbor Laboratory Press, New York, 1988). Shorter fragments of antibodies
can also serve as specific binding agents, including, for instance, Fabs, Fvs, and single-chain Fvs
(SCFvs). Antibody fragments are defined as follows: (1) Fab, the fragment which contains a
monovalent antigen-binding fragment of an antibody molecule produced by digestion of whole
antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain; (2)
Fab', the fragment of an antibody molecule obtained by treating whole antibody with pepsin,
followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab'
fragments are obtained per antibody molecule; (3) (Fab')2, the fragment of the antibody obtained by
treating whole antibody with the enzyme pepsin without subsequent reduction; (4) F(ab')2, a dimer of
two Fab' fragments held together by two disulfide bonds; (5) Fv, a genetically engineered fragment
containing the variable region of the light chain and the variable region of the heavy chain expressed
as two chains; and (6) single chain antibody (SCA), a genetically engineered molecule containing the
variable region of the light chain, the variable region of the heavy chain, linked by a suitable
polypeptide linker as a genetically fused single chain molecule. Methods of making these fragments
are routine.

Substrate: A molecule that binds to an enzyme, such as a P450 oxygenase, and undergoes a
chemical change, such as oxidation, during the ensuing enzymatic reaction. Exemplar substrates for
the disclosed P450 oxygenases are described throughout this specification. An "exogenous
substrate" is a substrate that is added to a particular type of cell.

Taxadien-5-ol transacylase activity: Capable of transferring an acyl group (such as an
acetyl group) from an acyl carrier (such as acetyl-CoA) to a taxoid substrate comprising a hydroxyl
group at C5 (such as taxadien-5α-ol); for additional details, see, e.g., U.S. Pat. No. 6,287,835.

Taxadien-2-ol transacylase activity: Capable of transferring an acyl group (such as a
benzoyl group) from an acyl carrier (such as benzoyl CoA) to a taxoid substrate comprising a
hydroxyl group at C2 (such as 2-debenzoyl-7,13-diacetylbaccatin III); for additional details, see, e.g.,
PCT Pub. No. WO 01/23586.

Taxadiene synthase activity: Capable of cyclizing geranylgeranyl diphosphate to produce
taxadiene, as described in detail, e.g., in U.S. Pat. Nos. 6,610,527; 6,114,160; and 5,994,114.

Taxoid: A chemical based on the taxane ring structure
(pentamethyl[9.3.1.0]³,⁸tricyclopentadecane). The core taxane ring structure is described, for
example, in Kingston et al., Progress in the Chemistry of Organic Natural Products, Springer-Verlag,
1993, and has the chemical structure:




Exemplary taxoids are described throughout the specification and also include, without limitation,
taxadiene, taxadienyl acetate (including, e.g., taxa-5α-yl acetate), taxa-4(5),11(12)-diene,
taxa-4(20),11(12)-diene, taxadien-5α-ol, taxa-4(20),11(12)-dien-5,13-diol, 5α-acetoxy-10β,14β-dihydroxy
taxadiene, 2-debenzoyl taxane, 10-deacetyl baccatin III, baccatin III, 3'-N-debenzoyltaxol,
taxa-4(20),11(12)-dien-5α,9α,10β-triol, taxa-4(20),11(12)-dien-2α,5α-diol (and diacetate ester);
taxa-4(20),11(12)-dien-5α,9α,10β,13α-tetraol and corresponding tetraacetate (taxusin tetraol and taxusin,
respectively), taxa-4(20),11(12)-dien-5α,9α-diol (and monoacetate and diacetate);
taxa-4(20),11(12)-dien-5α,10β-diol (and monoacetate and diacetate); taxa-4(20),11(12)-dien-5α,9α,10β-triol (and
acetate esters); and a taxoid having a 5-methylene group (R-CH2-R).

Transfected: A process by which a nucleic acid molecule is introduced into a cell, for
instance by molecular biology techniques, resulting in a transfected (or transformed) cell. As used
herein, the term transfection encompasses all techniques by which a nucleic acid molecule might be
introduced into such a cell, including transduction with viral vectors, transfection with plasmid
vectors, and introduction of DNA by electroporation, lipofection, and particle gun acceleration.

Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a
transfected (or transformed) host cell. A vector can include nucleic acid sequences that permit it to
replicate in a host cell, such as an origin of replication. A vector can also include one or more
selectable marker genes and other genetic elements known in the art.

Unless otherwise explained, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The singular terms "a," "an," and "the" include plural referents unless context clearly indicates
otherwise. Similarly, the word "or" is intended to include "and" unless the context clearly indicates
otherwise. Hence "comprising A or B" means "including A or B," or "including A and B." It is
further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular
mass values, given for nucleic acids or polypeptides are approximate, and are provided for
description. Although methods and materials similar or equivalent to those described herein can be
used in the practice or testing of the present invention, suitable methods and materials are described
below. All publications, patent applications, patents, and other references mentioned herein are
incorporated by reference in their entirety. In case of conflict, the present specification, including
explanations of terms, will control. In addition, the materials, methods, and examples are illustrative
only and not intended to be limiting.

II. Description of Several Specific Embodiments

Disclosed herein are isolated proteins having taxoid oxygenase activity (such as, taxadiene
hydroxylation activity) and the nucleic acid sequences encoding such proteins (including, for
example, SEQ ID NO: 1). In some embodiments, the protein comprises an amino acid sequence
having at least 80% or at least 95% sequence identity to SEQ ID NO: 2, or comprises the sequence in
SEQ ID NO: 2.

Isolated nucleic acid molecules that (i) hybridize under high (or very high) stringency
conditions with a nucleic acid probe comprising at least 600 base pairs of SEQ ID NO: 1 and (ii)
encode a protein having taxoid oxygenase activity are also contemplated by this disclosure; as are the
taxoid oxygenase proteins encoded by such nucleic acid molecules.

Further provided herein are isolated nucleic acid molecules having a sequence at least 80%
identical to the nucleic acid sequence in SEQ ID NO: 1 and encoding a protein having taxoid
oxygenase activity, such as taxoid 5α-hydroxylase activity. A protein encoded by such a nucleic acid
molecule is also disclosed.

Also provided are recombinant nucleic acid molecules, which include a promoter sequence
operably linked to a nucleic acid molecule encoding a disclosed taxoid oxygenase protein (such as, a
taxoid 5α-hydroxylase). In certain examples, a cell (such as, a plant cell (including a Taxus cell or
cell line), an insect cell, a bacterium, or a yeast cell) or a non-human transgenic organism (such as a
plant, including a plant from the genus Taxus) is transformed with the recombinant nucleic acid. In
particular examples, the cell is an isolated cell, such as a cell line.

This disclosure includes methods of identifying a nucleic acid sequence that encodes a taxoid
oxygenase, which involve (i) hybridizing a probe to a nucleic acid sequence under high (or very high)
stringency conditions, wherein the probe comprises at least 600 contiguous nucleotides of SEQ ID
NO: 1; and (ii) determining that a protein encoded by the nucleic acid sequence is capable of
oxidizing a taxoid substrate. A protein capable of oxidizing a taxoid substrate is thereby identified as
a taxoid oxygenase. In some examples, oxidizing the taxoid substrate involves hydroxylating the
taxoid substrate.

Methods of hydroxylating a substrate are also disclosed. Such methods involve contacting a
substrate with at least one oxygenase having an amino acid sequence at least 95% identical to SEQ
ID NO: 2 (or having the sequence of SEQ ID NO: 2); and allowing the oxygenase to oxidize the
substrate. In some methods, oxidation of the substrate involves hydroxylation of the substrate. In
other methods, the substrate is a taxoid (such as, paclitaxel, a paclitaxel intermediate, a taxadiene,
taxa-4(5),11(12)-diene or taxa-4(20),11(12)-diene). In some cases, hydroxylation occurs at position
C5 of the taxoid. In specific embodiments, the oxygenase is expressed in an isolated cell or in a
transgenic plant, bacterium, insect, fungus or yeast, and the hydroxylation of the substrate occurs in
vivo. In other embodiments, the substrate is an exogenous substrate, which is fed to the isolated cell,
transgenic plant, transgenic bacterium, transgenic insect, transgenic fungus or transgenic yeast.

Also provided herein are methods for increasing paclitaxel yield in a cell (such as a Taxus
cell, including, for example, a Taxus cell line), which involve introducing any of the taxoid
oxygenase-encoding recombinant nucleic acid molecules disclosed herein into a paclitaxel-producing
cell, wherein the production of paclitaxel is increased in the cell following the introduction of the
recombinant nucleic acid molecule. In a particular example, the recombinant nucleic acid molecule
that is introduced into the cell includes the nucleic acid sequence in SEQ ID NO: 1. In some
examples of this method, the amount of paclitaxel produced by the cell is at least four fold higher
following introduction of the recombinant nucleic acid molecule into the cell. In more specific
examples, methods for increasing paclitaxel yield in a cell (such as a Taxus cell, including, for
example, a Taxus cell line) further involve introducing additional nucleic acid molecules into the cell.
Exemplar additional nucleic acid molecules include those: (i) encoding a protein having
taxadiene synthase activity (e.g., nucleic acid molecules having at least 90% sequence identity to
SEQ ID NO: 19 (or its protein-coding region), and encoding a protein having taxadiene synthase
activity); (ii) encoding a protein having taxadien-5-ol transacylase activity (e.g., nucleic acid
molecules having at least 90% sequence identity to SEQ ID NO: 21 (or its protein-coding region),
and encoding a protein having taxadien-5-ol transacylase activity); (iii) encoding a protein having
taxadien-2-ol transacylase activity (e.g., nucleic acid molecules having at least 90% sequence identity
to SEQ ID NO: 23 (or its protein-coding region), and encoding a protein having taxadien-2-ol
transacylase activity); (iv) encoding a protein having taxoid oxygenase activity (such as, taxoid
7β-hydroxylase activity, taxoid 14β-hydroxylase activity, taxoid 10β-hydroxylase activity, taxoid
13α-hydroxylase activity, or taxoid 2α-hydroxylase activity) (e.g., nucleic acid molecules having at
least 90% sequence identity to any one of the sequences (or their respective protein-coding regions)
set forth in SEQ ID NOs: 3, 5, 7, 9, 11, 13, 15, 17, or 40 and encoding a protein having taxoid
oxygenase activity (such as, taxoid 7β-hydroxylase activity (e.g., SEQ ID NO: 7), taxoid
14β-hydroxylase activity (e.g., SEQ ID NO: 11), taxoid 10β-hydroxylase activity (e.g., SEQ ID
NO: 15), taxoid 13α-hydroxylase activity (e.g., SEQ ID NO: 17), or taxoid 2α-hydroxylase activity
(e.g., SEQ ID NO: 40))); or (v) combinations of (i), (ii), (iii), or (iv). In specific methods, the
additional nucleic acid molecules comprise one or more of the nucleic acid sequences set forth in
SEQ ID NOs: 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23 or 40 (or any combination thereof).

This disclosure also provides antibodies or antibody fragments that bind any of the taxoid
oxygenase proteins, such as a taxoid 5α-hydroxylase, described herein. In specific examples, the
antibody is a monoclonal antibody. In other examples, the antibody fragment is a Fab, F(ab')2, or Fv
fragment, or a combination thereof.

III. Taxoid 5α-Hydroxylase Nucleic Acids and Proteins

This disclosure provides P450 oxygenases, such as a taxoid 5α-hydroxylase, and variants
thereof, and nucleic acid molecules encoding these proteins, including cDNA sequences.

A nucleic acid molecule encoding a taxoid 5α-hydroxylase and the corresponding deduced
amino acid sequence of taxoid 5α-hydroxylase are shown in SEQ ID NOs: 1 and 2, respectively.
The nucleic acid molecule encodes a protein of 502 amino acids in length (SEQ ID NO: 2).

With the provision herein of the sequence of the taxoid 5α-hydroxylase protein (SEQ ID
NO: 2) and cDNA (SEQ ID NO: 1), in vitro nucleic acid amplification (such as polymerase chain
reaction (PCR)) may be utilized as a simple method for producing taxoid 5α-hydroxylase encoding
sequences. The following provides representative techniques for preparing cDNA in this manner.

RNA (such as mRNA or total RNA) is extracted from cells by any one of a variety of
methods well known to those of ordinary skill in the art. Sambrook et al. (In Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1989) and Ausubel et al. (In
Current Protocols in Molecular Biology, Greene Publ. Assoc. and Wiley-Intersciences, 1992)
provide descriptions of methods for RNA isolation. Taxoid 5α-hydroxylase is expressed, at least, in
cells from plants of the genus Taxus. Thus, in some examples, RNA may be extracted from Taxus
cells. The extracted RNA is then used, for example, as a template for performing reverse
transcription (RT)-PCR amplification of cDNA. Representative methods and conditions for RT-PCR
are described in Kawasaki et al. (In PCR Protocols, A Guide to Methods and Applications, Innis et
al. (eds.), 21-27, Academic Press, Inc., San Diego, California, 1990).

The selection of amplification primers will be made according to the portion(s) of the cDNA
that is to be amplified. In one embodiment, primers may be chosen to amplify a segment of a cDNA
or, in another embodiment, the entire cDNA molecule. Variations in amplification conditions may be
required to accommodate primers and amplicons of differing lengths and composition; such
considerations are well known in the art and are discussed for instance in Innis et al. (PCR Protocols,
A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990). By way of
example, the coding portion of the taxoid 5α-hydroxylase cDNA molecule (approximately 1509 base
pairs) may be amplified using the following combination of primers:

5'-ATGGACGCCCTGTATAAGAG-3' (forward) (SEQ ID NO: 32)

5'-TCAATTGACTATGGTCTCGG-3' (reverse) (SEQ ID NO: 33)

These primers are illustrative only; one skilled in the art will appreciate that many different
primers may be derived from the provided cDNA sequence in order to amplify particular regions of
taxoid 5α-hydroxylase cDNA, as well as the complete sequence of the taxoid 5α-hydroxylase cDNA.
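
As an illustration only, the two example primers can be inspected with a short script. The sketch below computes primer length, GC fraction, and a rough melting temperature. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption introduced here as a common rule of thumb for short oligonucleotides; it is not a method taken from this disclosure, and the function names are hypothetical.

```python
# Illustrative inspection of the example primers (SEQ ID NOs: 32 and 33).
# The Wallace rule is a rough approximation and is an assumption of this sketch.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

FORWARD = "ATGGACGCCCTGTATAAGAG"  # SEQ ID NO: 32
REVERSE = "TCAATTGACTATGGTCTCGG"  # SEQ ID NO: 33


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, len(primer), f"GC={gc_fraction(primer):.2f}", f"Tm~{wallace_tm(primer)}")
    # The reverse primer is complementary to the sense strand, so its reverse
    # complement is the sense-strand sequence that it covers.
    print(reverse_complement(REVERSE))
```

The forward primer begins with ATG and the reverse complement of the reverse primer ends with TGA, which is consistent with a primer pair that spans the start and stop codons of the coding portion.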

Re-sequencing of PCR products obtained by amplification procedures optionally can be
performed to facilitate confirmation of the amplified sequence and provide information about natural
variation of this sequence in different populations or species. Oligonucleotides derived from the
provided taxoid 5α-hydroxylase sequences may be used in such sequencing methods.




Orthologs of the disclosed P450 oxygenases, such as a taxoid 5α-hydroxylase, are likely
present in a number of other members of the Taxus genus (such as, T. brevifolia, T. canadensis,
T. baccata, T. globosa, T. floridana, T. wallichiana, T. media and T. chinensis) and other
taxoid-producing organisms (such as, Taxomyces andreanae). With the provision of the disclosed
oxygenase nucleic acid sequence, the cloning by standard methods of cDNAs and genes that encode
oxygenase orthologs in these other organisms is now enabled. Orthologs of the disclosed oxygenase
genes have oxygenase biological activity, including for example oxidation (such as, hydroxylation or
epoxidation) of the C5 position of a taxoid. Orthologs will generally share at least 65% sequence
identity with the disclosed P450 oxygenase cDNA (for example, SEQ ID NO: 1). Sequence identity
will generally be greater in Taxus species more closely related to Taxus cuspidata. In specific
embodiments, orthologous oxygenase (for example, taxoid 5α-hydroxylase) molecules may share at
least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 93%, at least
95%, or at least 98% sequence identity with the disclosed Taxus cuspidata nucleotide or amino acid
sequences.
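
To make the identity thresholds concrete, the following minimal sketch computes percent identity for two sequences that have already been aligned. It assumes one simple definition of identity (identical columns divided by aligned columns, ignoring columns that are gaps in both sequences); alignment programs may use other definitions. The second sequence is a hypothetical fragment invented for illustration, not a sequence from this disclosure.

```python
# Percent identity over a pre-aligned pair; '-' denotes an alignment gap.
# The definition used here is one common convention and is an assumption.

THRESHOLDS = (65, 70, 75, 80, 85, 90, 91, 93, 95, 98)  # percentages named in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical columns as a percentage of aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def highest_threshold_met(identity: float):
    """Largest listed threshold that the identity value reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


reference = "ATGGACGCCCTGTATAAGAG"      # forward primer region (SEQ ID NO: 32)
hypothetical = "ATGGATGCCCTGTACAAGAG"   # invented fragment with two substitutions

identity = percent_identity(reference, hypothetical)
print(identity, highest_threshold_met(identity))
```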

Both conventional hybridization and PCR amplification procedures may be utilized to clone
sequences encoding oxygenase orthologs. Common to both of these techniques is the hybridization
of probes or primers that are derived from the oxygenase nucleic acid sequences. Furthermore, the
hybridization may occur in the context of Northern blots, Southern blots, or PCR.

Direct PCR amplification may be performed on cDNA or genomic libraries prepared from
the plant species in question, or RT-PCR may be performed using mRNA extracted from the plant
cells using standard methods. PCR primers will comprise at least 10 consecutive nucleotides of the
oxygenase sequences. One of skill in the art will appreciate that sequence differences between the
oxygenase nucleic acid sequence and the target nucleic acid to be amplified may result in lower
amplification efficiencies. To compensate for this, longer PCR primers or lower annealing
temperatures may be used during the amplification cycle. Whenever lower annealing temperatures
are used, sequential rounds of amplification using nested primer pairs may be necessary to enhance
specificity.

For conventional hybridization techniques the hybridization probe is preferably conjugated
with a detectable label such as a radioactive label, and the probe is preferably at least 10 nucleotides
in length. As is well known in the art, increasing the length of hybridization probes tends to give
enhanced specificity. The labeled probe derived from the oxygenase nucleic acid sequence may be
hybridized to a plant cDNA or genomic library and the hybridization signal detected using methods
known in the art. The hybridizing colony or plaque (depending on the type of library used) is
purified and the cloned sequence contained in that colony or plaque isolated and characterized.

Orthologs of the oxygenases alternatively may be obtained by immunoscreening of an
expression library. With the provision herein of the disclosed oxygenase nucleic acid sequences, the
enzymes may be expressed and purified in a heterologous expression system (e.g., E. coli) and used
to raise antibodies (monoclonal or polyclonal) specific for oxygenases. Antibodies also may be
raised against synthetic peptides derived from the oxygenase amino acid sequence presented herein.
Methods of raising antibodies are well known in the art and are described generally in Harlow and
Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor, 1988. Such antibodies can be used to
screen an expression cDNA library produced from a plant. This screening will identify the
oxygenase ortholog. The selected cDNAs can be confirmed by sequencing and enzyme activity
assays.

Oligonucleotides derived from the taxoid 5α-hydroxylase cDNA sequence (e.g., SEQ ID
NO: 1), or fragments of this cDNA, are encompassed within the scope of the present disclosure.
Such oligonucleotides may be used, for example, as probes or primers. In one embodiment,
oligonucleotides may comprise a sequence of at least 10 consecutive nucleotides of the taxoid
5α-hydroxylase nucleic acid sequence. If these oligonucleotides are used with an in vitro
amplification procedure (such as PCR), lengthening the oligonucleotides may enhance amplification
specificity. Thus, in other embodiments, oligonucleotide primers comprising at least 15, 20, 25, 30,
35, 40, 45, 50, or more consecutive nucleotides of these sequences may be used.

One of ordinary skill in the art will appreciate that the specificity of a particular probe or
primer increases with its length. Thus, for example, a primer comprising 30 consecutive nucleotides
of a Taxus cuspidata taxoid 5α-hydroxylase encoding nucleotide will anneal to a target sequence,
such as a taxoid 5α-hydroxylase gene homolog present in a cDNA library from another Taxus species
(or other paclitaxel-producing species), with a higher specificity than a corresponding primer of only
15 nucleotides. Thus, in order to obtain greater specificity, probes and primers can be selected that
comprise at least 17, 20, 23, 25, 30, 35, 40, 45, 50 or more consecutive nucleotides of taxoid
5α-hydroxylase nucleotide sequences. In particular examples, probes or primers can be at least 100,
250, 500, or 600 consecutive nucleic acids of a disclosed 5α-hydroxylase sequence.
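
The relationship between primer length and specificity can be illustrated with a back-of-the-envelope model. The sketch below is not taken from this disclosure: it assumes a random target sequence with equal base frequencies and counts exact matches on one strand only, and the target size used in the example is a hypothetical round number rather than a measured genome or library size.

```python
# Expected number of chance exact matches for a primer of a given length,
# under the simplifying assumptions stated above.

def expected_random_sites(primer_length: int, target_size_bp: int) -> float:
    """Expected exact matches to one fixed k-mer in a random sequence (single strand)."""
    if primer_length < 1 or target_size_bp < primer_length:
        raise ValueError("primer must be at least 1 nt and no longer than the target")
    positions = target_size_bp - primer_length + 1  # possible start positions
    return positions / 4 ** primer_length           # each position matches with p = 4^-k


HYPOTHETICAL_TARGET_BP = 10 ** 9  # assumption for illustration only

for length in (10, 15, 17, 20, 30):
    print(length, expected_random_sites(length, HYPOTHETICAL_TARGET_BP))
```

Under these assumptions each added nucleotide reduces the expected number of chance matches about fourfold, which is the intuition behind preferring 30-nucleotide primers over 15-nucleotide primers.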

Oligonucleotides (such as, primers or probes) may be obtained from any region of a
disclosed 5α-hydroxylase nucleic acid sequence. By way of example, the taxoid 5α-hydroxylase
cDNA, ORF and gene sequences may be apportioned into about halves, thirds or quarters based on
sequence length, and the isolated nucleic acid molecules (e.g., oligonucleotides) may be derived from
the first or second halves of the molecules, from any of the three thirds, or from any of the four
quarters. The cDNA also could be divided into smaller regions, e.g., about eighths, sixteenths,
twentieths, fiftieths and so forth, with similar effect. The taxoid 5α-hydroxylase cDNA shown in
SEQ ID NO: 1 can be used to illustrate this. The taxoid 5α-hydroxylase cDNA is 1509 nucleotides in
length and so in one specific embodiment, it may be hypothetically divided into about halves
(nucleotides 40-794 and 795-1548), in another specific embodiment in about thirds (nucleotides
40-543, 544-1047, and 1048-1548) or, in yet another specific embodiment, in about quarters
(nucleotides 40-417, 418-795, 795-1173 and 1174-1548). Alternatively, it may be divided into
regions that encode for conserved domains such as, for example, the commonly occurring PERF
motif and the region surrounding the invariant, heme-binding cysteine residue (von Wachenfeldt and
Johnson, "Structures of eukaryotic cytochrome P450 enzymes," In: Cytochrome P450: Structure,
Mechanism, and Biochemistry, 2nd Ed., P.R. Ortiz de Montellano, ed., New York: Plenum, pp. 183-
223, 1995).
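
The apportioning described above can be sketched as a small helper that splits an inclusive coordinate range into a chosen number of roughly equal segments. The function name and the rule for distributing leftover nucleotides are assumptions of this sketch; because the text speaks of "about" halves, thirds and quarters, the boundaries it produces need not coincide exactly with the ranges recited above.

```python
# Split an inclusive 1-based coordinate range into near-equal contiguous segments.

def split_region(start: int, end: int, parts: int):
    """Return a list of (first, last) coordinate pairs covering start..end."""
    length = end - start + 1
    if parts < 1 or parts > length:
        raise ValueError("parts must be between 1 and the region length")
    base, extra = divmod(length, parts)
    segments = []
    cursor = start
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # earlier segments absorb the remainder
        segments.append((cursor, cursor + size - 1))
        cursor += size
    return segments


# Coding region coordinates as recited in the text (nucleotides 40-1548, 1509 nt).
for n in (2, 3, 4):
    print(n, split_region(40, 1548, n))
```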

IV. Cloning of the Taxoid 5α-Hydroxylase Gene

The taxoid 5α-hydroxylase cDNA sequence and fragments described above do not contain
introns, upstream transcriptional promoter or regulatory regions or downstream transcriptional
regulatory regions of the taxoid 5α-hydroxylase gene. The taxoid 5α-hydroxylase gene may be
isolated by routine procedures. For instance, the taxoid 5α-hydroxylase gene may be isolated by
homology screening using the cDNA sequence and the BLAST program. Direct sequencing, using
the "long-distance sequence method," of one or more BAC or PAC clones that contain the taxoid
5α-hydroxylase sequence can be employed.

Using the information disclosed herein, the regulatory elements flanking the taxoid
5α-hydroxylase gene can be identified and characterized. These regulatory elements may be
characterized by standard techniques. In one embodiment, deletion analysis is performed wherein
successive nucleotides of a putative regulatory region are removed and the effect of the deletions is
studied by transient expression analysis. In another embodiment, the effect of the deletions is studied
by long-term expression analysis. The identification and characterization of regulatory elements
flanking the genomic taxoid 5α-hydroxylase gene may be made by functional analysis (deletion
analyses, etc.) in Taxus cells by either transient or long-term expression analyses.

It will be apparent to one skilled in the art that either the genomic clone or the cDNA or
sequences derived from these clones may be utilized in applications, including but not limited to,
studies of the expression of the taxoid 5α-hydroxylase gene, studies of the function of the taxoid
5α-hydroxylase protein, and the generation of antibodies to the taxoid 5α-hydroxylase protein.
Descriptions of applications describing the use of taxoid 5α-hydroxylase cDNA, or fragments thereof,
are therefore intended to comprehend the use of the genomic taxoid 5α-hydroxylase gene.

It will also be apparent to one of ordinary skill in the art that taxoid 5α-hydroxylase genes
may now be cloned from other Taxus species by standard cloning methods. In one embodiment, such
orthologous taxoid 5α-hydroxylase genes will share at least 65% sequence identity with the taxoid
5α-hydroxylase nucleic acid disclosed herein; and in other embodiments, more closely related
orthologous sequences will share at least 70%, at least 75%, at least 80%, at least 90%, at least 95%,
or at least 98% sequence identity with this sequence.

V. Taxoid 5α-Hydroxylase Sequence Variants

With the provision of taxoid 5α-hydroxylase protein and corresponding nucleic acid
sequences herein, the creation of variants of these sequences is now enabled. Variant oxygenases
include proteins that differ in amino acid sequence from the oxygenase sequences disclosed, but that
retain oxygenase biological activity.

In one embodiment, variant taxoid 5α-hydroxylase proteins include proteins that differ in
amino acid sequence from the taxoid 5α-hydroxylase sequences disclosed but that share at least 70%
amino acid sequence identity with the provided taxoid 5α-hydroxylase protein. In other
embodiments, other variants will share at least 75%, at least 80%, at least 85%, at least 90%, at least
95%, or at least 98% amino acid sequence identity. Manipulation of the disclosed taxoid
5α-hydroxylase nucleotide sequence using standard procedures, including in one specific, non-limiting,
embodiment, site-directed mutagenesis or in another specific, non-limiting, embodiment,
PCR, can be used to produce such variants. The simplest modifications involve the substitution of
one or more amino acids for amino acids having similar biochemical properties. These so-called
conservative substitutions are likely to have minimal impact on the activity of the resultant protein.
The following table shows exemplar conservative amino acid substitutions:



Original Residue     Conservative Substitutions
Ala                  Ser
Arg                  Lys
Asn                  Gln; His
Asp                  Glu
Cys                  Ser
Gln                  Asn
Glu                  Asp
Gly                  Pro
His                  Asn; Gln
Ile                  Leu; Val
Leu                  Ile; Val
Lys                  Arg; Gln; Glu
Met                  Leu; Ile
Phe                  Met; Leu; Tyr
Ser                  Thr
Thr                  Ser
Trp                  Tyr
Tyr                  Trp; Phe
Val                  Ile; Leu
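
For readers who want to apply the table programmatically, it can be transcribed into a lookup structure. The sketch below is illustrative only; the dictionary mirrors the table entries, and the function name is hypothetical. The lookup is directional, following the table as written (original residue to listed substitution).

```python
# Conservative substitutions transcribed from the table above
# (three-letter amino acid codes, lower case).

CONSERVATIVE_SUBSTITUTIONS = {
    "ala": {"ser"},
    "arg": {"lys"},
    "asn": {"gln", "his"},
    "asp": {"glu"},
    "cys": {"ser"},
    "gln": {"asn"},
    "glu": {"asp"},
    "gly": {"pro"},
    "his": {"asn", "gln"},
    "ile": {"leu", "val"},
    "leu": {"ile", "val"},
    "lys": {"arg", "gln", "glu"},
    "met": {"leu", "ile"},
    "phe": {"met", "leu", "tyr"},
    "ser": {"thr"},
    "thr": {"ser"},
    "trp": {"tyr"},
    "tyr": {"trp", "phe"},
    "val": {"ile", "leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as a conservative substitution for `original`."""
    allowed = CONSERVATIVE_SUBSTITUTIONS.get(original.lower(), set())
    return replacement.lower() in allowed


print(is_conservative("Ala", "Ser"))  # listed in the table
print(is_conservative("Gly", "Ala"))  # not listed in the table
```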



In some embodiments, the functional identity of a 5α-hydroxylase variant can be maintained
if amino acid substitutions are introduced in regions outside of the conserved domains of the protein,
where amino acid substitutions are less likely to affect protein function. FIG. 6 shows the alignment
of nine taxoid oxygenase amino acid sequences, including the 5α-hydroxylase sequence disclosed
herein. Shaded amino acid residues are conserved among all of the illustrated sequences. In certain
embodiments, oxygenase variants share the highly conserved (i.e., shaded and marked by asterisk)
amino acid residues shown in FIG. 6. FIG. 6 also demonstrates conservative amino acid variations
(i.e., marked by ":") among these taxoid oxygenase sequences. In other embodiments, oxygenase
variants having conservative substitutions (as described in the foregoing table) at the amino acid
positions indicated by ":" in FIG. 6 are contemplated herein. Amino acid residues that are not highly
conserved (i.e., shaded or marked by asterisk in FIG. 6) or conservative variations (i.e., marked by ":"
in FIG. 6) are least likely to be functionally relevant and, therefore, may tolerate less conservative
amino acid substitutions with little to no effect on the function of the resultant variant. In other
embodiments, 5α-hydroxylase protein variants may be designed (as discussed above) based on highly
conserved and conservative amino acids shown in the alignment of three of the foregoing nine amino
acid sequences, as shown in FIG. 2.

In another embodiment, more substantial changes in 5α-hydroxylase function or other
protein features may be obtained by selecting amino acid substitutions that are less conservative than
conservative substitutions. In one specific, non-limiting, embodiment, such changes include
changing residues that differ more significantly in their effect on maintaining polypeptide backbone
structure (e.g., sheet or helical conformation) near the substitution, charge or hydrophobicity of the
molecule at the target site, or bulk of a specific side chain. The following specific, non-limiting,
examples are generally expected to produce the greatest changes in protein properties: (a) a
hydrophilic residue (e.g., seryl or threonyl) is substituted for (or by) a hydrophobic residue (e.g.,
leucyl, isoleucyl, phenylalanyl, valyl or alanyl); (b) a cysteine or proline is substituted for (or by) any
other residue; (c) a residue having an electropositive side chain (e.g., lysyl, arginyl, or histidyl) is
substituted for (or by) an electronegative residue (e.g., glutamyl or aspartyl); or (d) a residue having a
bulky side chain (e.g., phenylalanine) is substituted for (or by) one lacking a side chain (e.g.,
glycine).

Variant taxoid 5α-hydroxylase encoding sequences may be produced by standard DNA
mutagenesis techniques. In one specific, non-limiting, embodiment, M13 primer mutagenesis is
performed. Details of these techniques are provided in Sambrook et al. (In Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 1989), Ch. 15. By the use of
such techniques, variants may be created that differ in minor ways from the taxoid 5α-hydroxylase
sequences disclosed. In one embodiment, DNA molecules and nucleotide sequences that are
derivatives of those specifically disclosed herein, and which differ from those disclosed by the
deletion, addition, or substitution of nucleotides while still encoding a protein that has at least 65%
sequence identity with the taxoid 5α-hydroxylase encoding sequence disclosed (SEQ ID NO: 1), are
comprehended by this disclosure. In other embodiments, more closely related nucleic acid molecules
that share at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least
98% nucleotide sequence identity with the disclosed taxoid 5α-hydroxylase sequences are
comprehended by this disclosure. Alternatively, related nucleic acid molecules can have no more
than 3, 5, 10, 20, 50, 75, or 100 nucleic acid changes compared to SEQ ID NO: 1. In one
embodiment, such variants may differ from the disclosed sequences by alteration of the coding region
to fit the codon usage bias of the particular organism into which the molecule is to be introduced.

In other embodiments, the coding region may be altered by taking advantage of the
degeneracy of the genetic code to alter the coding sequence such that, while the nucleotide sequence
is substantially altered, it nevertheless encodes a protein having an amino acid sequence substantially
similar to the disclosed taxoid 5α-hydroxylase protein sequences. For example, because of the
degeneracy of the genetic code, four nucleotide codon triplets (GCT, GCG, GCC and GCA) code
for alanine. The coding sequence of any specific alanine residue within the taxoid 5α-hydroxylase
protein, therefore, could be changed to any of these alternative codons without affecting the amino
acid composition or characteristics of the encoded protein. Based upon the degeneracy of the genetic
code, variant DNA molecules may be derived from the cDNA and gene sequences disclosed herein
using standard DNA mutagenesis techniques, as described above, or by synthesis of DNA sequences.
Thus, this disclosure also encompasses nucleic acid sequences that encode a taxoid 5α-hydroxylase
protein, but which vary from the disclosed nucleic acid sequences by virtue of the degeneracy of the
genetic code.
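
The alanine example can be checked with a few lines of code. The sketch below is illustrative and rests on stated assumptions: it uses only the handful of standard genetic code entries needed for the fragment, and the fragment is the first 18 nucleotides of the forward primer given earlier (SEQ ID NO: 32), whose third codon is GCC. Function names are hypothetical.

```python
# Demonstrate that swapping one alanine codon for its synonyms leaves the
# translated amino acid sequence unchanged.

# Partial standard genetic code: only the codons needed for this fragment.
CODON_TABLE = {
    "ATG": "M", "GAC": "D", "CTG": "L", "TAT": "Y", "AAG": "K",
    "GCT": "A", "GCG": "A", "GCC": "A", "GCA": "A",
}
ALANINE_CODONS = ("GCT", "GCG", "GCC", "GCA")

FRAGMENT = "ATGGACGCCCTGTATAAG"  # first six codons of the forward primer


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial table above."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def swap_codon(dna: str, codon_index: int, new_codon: str) -> str:
    """Replace the codon at a zero-based codon index with another codon."""
    start = 3 * codon_index
    return dna[:start] + new_codon + dna[start + 3:]


variants = [swap_codon(FRAGMENT, 2, codon) for codon in ALANINE_CODONS]
for variant in variants:
    print(variant, translate(variant))
```

All four nucleotide variants translate to the same six-residue peptide, which is the sense in which the degenerate sequences encode the same protein.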

In one embodiment, variants of the taxoid 5α-hydroxylase protein may also be defined in
terms of their sequence identity with the prototype taxoid 5α-hydroxylase protein. As described
above, taxoid 5α-hydroxylase proteins share at least 70%, at least 75%, at least 80%, at least 85%, at
least 90%, at least 95%, or at least 98% amino acid sequence identity with the taxoid 5α-hydroxylase
protein (SEQ ID NO: 2). Alternatively, variants of the taxoid 5α-hydroxylase protein can have no
more than 3, 5, 10, 15, 20, 25, 30, 40, or 50 amino acid changes compared to SEQ ID NO: 2. Nucleic
acid sequences that encode such proteins/fragments readily may be determined simply by applying
the genetic code to the amino acid sequence of a taxoid 5α-hydroxylase protein or fragment, and such
nucleic acid molecules may readily be produced by assembling oligonucleotides corresponding to
portions of the sequence.

Nucleic acid molecules that are derived from the taxoid 5α-hydroxylase cDNA nucleic acid
sequences include molecules that hybridize under low stringency, high stringency, or very high
stringency conditions to the disclosed prototypical taxoid 5α-hydroxylase nucleic acid molecules, and
fragments thereof.

Taxoid 5α-hydroxylase nucleic acid encoding molecules (including the cDNA shown in
SEQ ID NO: 1, and nucleic acids comprising this sequence), and orthologs and homologs of these
sequences, may be incorporated into transformation or expression vectors.

VI. Introduction of Oxygenases into Plants or Plant Cells

A nucleic acid molecule (such as a cDNA or gene) encoding taxoid 5α-hydroxylase may be
incorporated into any organism (intact plant, animal, microbe, etc.) or cell or tissue culture system
(such as, suspension cell culture, callus cell culture, or immobilized cell culture) for any useful
purpose known to those of ordinary skill in the art, including, without limitation, (i) production of
taxoid 5α-hydroxylase, (ii) synthesis of 5α-hydroxylated taxoids, such as taxadien-5α-ol;
(iii) enhancement of the rate of production and/or the absolute amount of one or more taxoids derived
from 5α-hydroxylated taxoids, such as taxadien-5α-ol; (iv) enhancement of the rate of production
and/or the absolute amount of paclitaxel or paclitaxel intermediates or derivatives.

In one embodiment, a disclosed 5α-hydroxylase nucleic acid molecule is introduced into a
plant or plant cell, for example, a gymnosperm species (such as, a Taxus species). Gymnosperms are
a useful expression system, at least, because of (i) compatible codon usage for high translational
efficiency; (ii) recognition of the encoded preprotein by the plastid import system; (iii) high fidelity
in proteolytic processing by the plastids to the mature enzyme form; and (iv) efficient protein-protein
interaction with upstream and downstream enzymes of the paclitaxel pathway for most efficient
channeling of metabolites.

After a cDNA (or gene) encoding a protein involved in the determination of a particular
plant characteristic has been isolated, standard techniques may be used to express the cDNA in
transgenic plants in order to modify the particular plant characteristic. The basic approach is to clone
the cDNA into an expression vector, such that the cDNA is operably linked to control sequences
(e.g., a promoter), which direct expression of the cDNA in plant cells. The transformation vector is
introduced into plant cells by any of various techniques (e.g., electroporation), and progeny plants
containing the introduced cDNA are selected. Preferably all or part of the transformation vector
stably integrates into the genome of the plant cell. That part of the transformation vector that
integrates into the plant cell and that contains the introduced cDNA and associated sequences for
controlling expression (the introduced "transgene") may be referred to as the recombinant expression
cassette.

Selection of progeny plants containing the introduced transgene may be made based upon
the detection of an altered phenotype. Such a phenotype may result directly from the cDNA cloned
into the transformation vector or may be manifest as enhanced resistance to a chemical agent (such as
an antibiotic) as a result of the inclusion of a dominant selectable marker gene incorporated into the
transformation vector.

Successful examples of the modification of plant characteristics by transformation with
cloned cDNA sequences are replete in the technical and scientific literature. Selected examples that
serve to illustrate the knowledge in this field of technology include, without limitation, U.S. Patent
No. 4,459,355 ("Method for Transforming Plant Cells"); U.S. Patent No. 5,571,706 ("Plant Virus
Resistance Gene and Methods"); U.S. Patent No. 5,677,175 ("Plant Pathogen Induced Proteins");
U.S. Patent No. 5,510,471 ("Chimeric Gene for the Transformation of Plants"); U.S. Patent No.
5,750,386 ("Pathogen-Resistant Transgenic Plants"); U.S. Patent No. 5,597,945 ("Plants Genetically
Enhanced for Disease Resistance"); U.S. Patent No. 5,589,615 ("Process for the Production of
Transgenic Plants with Increased Nutritional Value Via the Expression of Modified 2S Storage
Albumins"); U.S. Patent No. 5,750,871 ("Transformation and Foreign Gene Expression in Brassica
Species"); U.S. Patent No. 5,268,526 ("Overexpression of Phytochrome in Transgenic Plants"); U.S.
Patent No. 5,262,316 ("Genetically Transformed Pepper Plants and Methods for their Production");
U.S. Patent No. 5,569,831 ("Transgenic Tomato Plants with Altered Polygalacturonase Isoforms");
U.S. Patent No. 5,932,782 ("Plant Transformation Method Using Agrobacterium Species Adhered to
Microprojectiles"); and U.S. Patent No. 6,759,573 ("Method to Enhance Agrobacterium-Mediated
Transformation of Plants").

These examples include descriptions of transformation vector selection, transformation
techniques, and the construction of constructs designed to over-express the introduced cDNA. In
light of the foregoing and the provision herein of the oxygenase amino acid sequences and nucleic
acid sequences, it is thus apparent that one of ordinary skill in the art will be able to introduce the
cDNAs, or homologous or derivative forms of these molecules, into plants in order to produce plants
having enhanced oxygenase activity. Furthermore, the expression of one or more oxygenases in
plants may give rise to plants having increased production of paclitaxel and related compounds.
A. Vector Construction, Choice of Promoters 

A number of recombinant vectors suitable for stable transfection of plant cells or for the
establishment of transgenic plants have been described, including those described in Weissbach and
Weissbach, Methods for Plant Molecular Biology, Academic Press, 1989; and Gelvin et al., Plant
Molecular Biology Manual, Kluwer Academic Publishers, 1990. Typically, plant transformation
vectors include one or more cloned plant genes (or cDNAs) under the transcriptional control of 5'-
and 3'-regulatory sequences and a dominant selectable marker. Such plant transformation vectors
typically also contain a promoter regulatory region (e.g., a regulatory region controlling inducible or
constitutive, environmentally or developmentally regulated, or cell- or tissue-specific expression), a
transcription initiation start site, a ribosome binding site, an RNA processing signal, a transcription
termination site, and/or a polyadenylation signal.

Examples of constitutive plant promoters that may be useful for expressing the cDNA
include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level
expression in most plant tissues (see, e.g., Odell et al., Nature, 313:810, 1985; Dekeyser et al., Plant
Cell, 2:591, 1990; Terada and Shimamoto, Mol. Gen. Genet., 220:389, 1990; Benfey and Chua,
Science, 250:959-966, 1990); the nopaline synthase promoter (An et al., Plant Physiol., 88:547,
1988); and the octopine synthase promoter (Fromm et al., Plant Cell, 1:977, 1989).
Agrobacterium-mediated transformation of Taxus species has been accomplished, and the resulting
callus cultures have been shown to produce paclitaxel (Han et al., Plant Science, 95:187-196, 1994).
Therefore, it is likely that incorporation of one or more of the described oxygenases under the
influence of a strong promoter (like CaMV promoter) would increase production yields of paclitaxel
and related taxoids in such transformed cells.

A variety of plant gene promoters that are regulated in response to environmental, hormonal,
chemical, and/or developmental signals also can be used for expression of the cDNA in plant cells,
including promoters regulated by: (a) heat (Callis et al., Plant Physiol., 88:965, 1988; Ainley, et al.,
Plant Mol. Biol., 22:13-23, 1993; and Gilmartin et al., Plant Cell, 4:839-949, 1992); (b) light (e.g.,
the pea rbcS-3A promoter, Kuhlemeier et al., Plant Cell, 1:471, 1989, and the maize rbcS promoter,
Schaffner and Sheen, Plant Cell, 3:997, 1991); (c) hormones, such as abscisic acid (Marcotte et al.,
Plant Cell, 1:969, 1989); (d) wounding (e.g., wunI, Siebertz et al., Plant Cell, 1:961, 1989); and (e)
chemicals such as methyl jasmonate or salicylic acid (see also Gatz et al., Ann. Rev. Plant Physiol.
Plant Mol. Biol., 48:9-108, 1997).

Alternatively, tissue-specific (root, leaf, flower, and seed, for example) promoters
(Carpenter et al., Plant Cell, 4:557-571, 1992; Denis et al., Plant Physiol., 101:1295-1304, 1993;
Opperman et al., Science, 263:221-223, 1993; Stockhause et al., Plant Cell, 9:479-489, 1997; Roshal
et al., EMBO J., 6:1155, 1987; Schernthaner et al., EMBO J., 7:1249, 1988; and Bustos et al., Plant
Cell, 1:839, 1989) can be fused to the coding sequence to obtain a particular expression in respective
organs.

Alternatively, the native oxygenase gene promoters may be utilized. With the provision
herein of the oxygenase nucleic acid sequences, one of skill in the art will appreciate that standard
molecular biology techniques can be used to determine the corresponding promoter sequences. One
of skill in the art also will appreciate that less than the entire promoter sequence may be used in order
to obtain effective promoter activity. The determination of whether a particular region of this
sequence confers effective promoter activity may be ascertained readily by operably linking the
selected sequence region to an oxygenase cDNA (in conjunction with suitable 3' regulatory region,
such as the NOS 3' regulatory region as discussed below) and determining whether the oxygenase is
expressed.

Plant transformation vectors also may include RNA processing signals, for example, introns,
that may be positioned upstream or downstream of the ORF sequence in the transgene. In addition,
the expression vectors also may include additional regulatory sequences from the 3'-untranslated
region of plant genes, e.g., a 3'-terminator region, to increase mRNA stability of the mRNA, such as
the PI-II terminator region of potato or the octopine or nopaline synthase (NOS) 3'-terminator
regions. The native oxygenase gene 3'-regulatory sequence also may be employed.

As noted above, plant transformation vectors also may include dominant selectable marker
genes to allow for the ready selection of transformants. Such genes include those encoding antibiotic
resistance genes (e.g., resistance to hygromycin, kanamycin, bleomycin, G418, streptomycin, or
spectinomycin) and herbicide resistance genes (e.g., phosphinothricin acetyltransferase).
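
The vector elements discussed in this section can be summarized as a simple data structure. The sketch below is a bookkeeping illustration only, not a design from this disclosure: the class name, field names, and the particular element choices in the example (CaMV 35S promoter, NOS terminator, kanamycin resistance marker) are assumptions drawn from the options named in the text. In a real vector the selectable marker would normally carry its own regulatory sequences, which this sketch does not model.

```python
# Minimal record of the elements of a hypothetical plant transformation construct.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpressionCassette:
    promoter: str
    coding_sequence: str
    terminator: str
    selectable_marker: Optional[str] = None

    def layout(self) -> List[str]:
        """Elements of the gene of interest in 5'-to-3' order, then the marker if any."""
        parts = [self.promoter, self.coding_sequence, self.terminator]
        if self.selectable_marker:
            parts.append(self.selectable_marker)
        return parts


# Hypothetical example assembled from options mentioned in the text.
cassette = ExpressionCassette(
    promoter="CaMV 35S promoter",
    coding_sequence="taxoid 5-alpha-hydroxylase ORF",
    terminator="NOS 3'-terminator",
    selectable_marker="kanamycin resistance gene",
)
print(" | ".join(cassette.layout()))
```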

B. Arrangement of Taxol Oxygenase Sequence in a Vector

The particular arrangement of the oxygenase sequence in the transformation vector is
selected according to the type of expression of the sequence that is desired. In most instances,
enhanced oxygenase activity is desired, and the oxygenase ORF is operably linked to a constitutive
high-level promoter such as the CaMV 35S promoter. As noted above, enhanced oxygenase activity
also may be achieved by introducing into a plant a transformation vector containing a variant form of
the oxygenase cDNA or gene, for example a form that varies from the exact nucleotide sequence of
the oxygenase ORF, but that encodes a protein retaining an oxygenase biological activity.
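
The idea of a variant that differs in nucleotide sequence yet encodes the same protein can be illustrated with a minimal sketch. The two short ORFs below are invented for illustration and are not taken from any sequence of this disclosure; the codon table is a deliberately partial excerpt of the standard genetic code covering only the codons used here.

```python
# Illustrative sketch only: both ORFs are hypothetical, not disclosed sequences.

# Partial standard genetic code, limited to the codons used in this example.
CODON_TABLE = {
    "ATG": "M", "GAT": "D", "GAC": "D", "GCT": "A", "GCC": "A",
    "CTG": "L", "TTA": "L", "TAA": "*", "TGA": "*",
}


def translate(orf: str) -> str:
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


reference_orf = "ATGGATGCTCTGTAA"  # hypothetical reference ORF
variant_orf = "ATGGACGCCTTATGA"    # hypothetical variant using synonymous codons

reference_protein = translate(reference_orf)
variant_protein = translate(variant_orf)

# Count nucleotide positions at which the two ORFs differ.
differing_positions = sum(a != b for a, b in zip(reference_orf, variant_orf))
```

In this toy case the two ORFs differ at several nucleotide positions while translating to the same short peptide, which is the simplest kind of variant contemplated above; variants that change the amino acid sequence but retain activity would have to be assessed experimentally.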
C. Transformation and Regeneration Techniques 

Transformation and regeneration of a wide variety of plant species, including gymnosperms,
angiosperms, monocots and dicots, are now routine (see, e.g., Glick and Thompson, eds., Methods in
Plant Molecular Biology, CRC Press, Boca Raton, Fla., 1993), and the appropriate transformation
technique can be determined by the practitioner. The choice of method varies with the type of plant
to be transformed; those skilled in the art will recognize the suitability of particular methods for given
plant types. Suitable methods may include, but are not limited to: electroporation of plant protoplasts
(e.g., Rhodes et al., Science, 240(4849):204-207, 1988); liposome-mediated transformation;
polyethylene glycol (PEG)-mediated transformation (e.g., Lyznik et al., Plant Mol. Biol., 13:151-161,
1989); transformation using viruses (e.g., Brisson et al., Nature, 310:511-514, 1984); microinjection
of plant cells (e.g., de la Pena et al., Nature, 325:274-276, 1987); micro-projectile bombardment of
plant cells (Klein et al., Plant Physiol., 91:440-444, 1989; Boynton et al., Science, 240(4858):1534-1538,
1988); vacuum infiltration; and Agrobacterium tumefaciens (AT)-mediated transformation.
Exemplar procedures for transforming and regenerating plants are described, for instance, in the
patent documents listed at the beginning of this section. Additionally, plant transformation strategies
and techniques are reviewed by Birch (Ann. Rev. Plant Phys. Plant Mol. Biol., 48:297, 1997), and
Forester et al. (Exp. Agric., 33:15-33, 1997).

In particular embodiments, transformation of Taxus species can be achieved, for example, 
by employing the methods of Han et al. (Plant Science, 95:187-196, 1994).

D. Selection of Transformed Plants 

Following transformation and regeneration of plants with the transformation vector,
transformed plants or cells can be selected using a selectable marker incorporated into the
transformation vector. In some examples, such a marker confers antibiotic resistance on the seedlings
of transformed plants, and selection of transformants can be accomplished by exposing the seedlings
to appropriate concentrations of the antibiotic. For instance, a commonly used selectable marker
gene is neomycin phosphotransferase II (NPT II), which confers resistance to the antibiotic
kanamycin. Another selectable marker gene which can be employed is the gene which confers
resistance to the herbicide glufosinate (Basta). A screenable gene commonly used is the
β-glucuronidase gene (GUS). The presence of this gene is characterized using a histochemical reaction
in which a sample of putatively transformed cells is treated with a GUS assay solution. After an
appropriate incubation, the cells containing the transformation vector (which includes the GUS gene)
turn blue.

After transformed plants are selected and grown to maturity, they can be assayed using the
methods described herein to assess production levels of paclitaxel and other taxoids.

VII. Production of Recombinant Taxoid Oxygenase in Heterologous Expression Systems
Various commonly known systems are available for heterologous expression of the
disclosed 5α-hydroxylase nucleic acid molecules to yield the encoded proteins, including eukaryotic
and prokaryotic expression systems. In some examples, eukaryotic expression systems are used to
facilitate posttranslational modification of the expressed protein and/or to direct the expressed protein
to a desired cellular compartment.

Methods of expressing proteins in heterologous expression systems are well known in the
art. Typically, a nucleic acid molecule encoding all or part of the protein of interest, such as a
5α-hydroxylase, is obtained using methods such as those described herein. The protein-encoding
nucleic acid sequence is cloned into an expression vector that is suitable for the particular host cell of
interest using standard recombinant DNA procedures. Expression vectors include (among other
elements) regulatory sequences (e.g., promoters) that can be operably linked to the desired
protein-encoding nucleic acid molecule to cause the expression of such nucleic acid molecule in the
host cell. Together, the regulatory sequences and the protein-encoding nucleic acid sequence are an
"expression cassette." Expression vectors may also include an origin of replication, marker genes
that provide phenotypic selection in transformed cells, one or more other promoters, and a polylinker
region containing several restriction sites for insertion of heterologous nucleic acid sequences.

Expression vectors useful for expression of heterologous protein(s) in a multitude of host
cells are well known in the art, and some specific examples are provided herein. The host cell is
transfected with (or infected with a virus containing) the expression vector using any method suitable
for the particular host cell. Such transfection methods are also well known in the art and non-limiting
exemplar methods are described herein. The transfected (also called transformed) host cell is
capable of expressing the protein encoded by the corresponding nucleic acid sequence in the
expression cassette. Transient or stable transfection of the host cell with one or more expression
vectors is contemplated by the present disclosure.

The cloned expression vector encoding one or more of the disclosed oxygenases may be
transformed into any of various cell types for expression of the cloned nucleotide. Many different
types of cells may be used to express modified nucleic acid molecules. Examples include cells of
yeasts, fungi, insects, mammals, and plants, including primary cells and immortal cell lines. For
instance, common mammalian cells that could be used include HeLa cells, SW-527 cells (ATCC
deposit #7940), WISH cells (ATCC deposit #CCL-25), Daudi cells (ATCC deposit #CCL-213),
Madin-Darby bovine kidney cells (ATCC deposit #CCL-22) and Chinese hamster ovary (CHO)
cells (ATCC deposit #CRL-2092). Common yeast cells include Pichia pastoris (ATCC deposit
#201178) and Saccharomyces cerevisiae (ATCC deposit #46024). Insect cells include cells from
Drosophila melanogaster (ATCC deposit #CRL-10191), the cotton bollworm (ATCC deposit #CRL-9281),
and Trichoplusia ni egg cell homoflagellates. Fish cells that may be used include those from
rainbow trout (ATCC deposit #CLL-55), salmon (ATCC deposit #CRL-1681), and zebrafish (ATCC
deposit #CRL-2147). Amphibian cells that may be used include those of the bullfrog, Rana
catesbeiana (ATCC deposit #CLL-41). Reptile cells that may be used include those from Russell's
viper (ATCC deposit #CCL-140). Plant cells that could be used include Chlamydomonas cells
(ATCC deposit #30485), Arabidopsis cells (ATCC deposit #54069), tomato plant cells (ATCC
deposit #54003) and Taxus cells (including, e.g., cells from T. cuspidata, T. brevifolia, T. canadensis,
T. baccata, T. globosa, T. floridana, T. wallichiana, T. media and T. chinensis). Many of these cell
types are commonly used and are available from the ATCC as well as from commercial suppliers
such as Pharmacia (Uppsala, Sweden), and Invitrogen.

Expressed protein may be accumulated within a cell or may be secreted from the cell. Such
expressed protein may then be collected and purified. This protein may be characterized for activity
and stability and may be used to practice any of the various methods disclosed herein. Further details
of some specific embodiments are discussed below.

A. Yeast

Various yeast strains and yeast-derived vectors are used commonly for the expression of
heterologous proteins. For instance, Pichia pastoris expression systems, obtained from Invitrogen
(Carlsbad, California), may be used to express the disclosed P450 oxygenases, such as a taxoid
5α-hydroxylase. Such systems include suitable Pichia pastoris strains, vectors, reagents,
transformants, sequencing primers, and media. Available strains include KM71H (a prototrophic
strain), SMD1168H (a prototrophic strain), and SMD1168 (a pep4 mutant strain) (Invitrogen Product
Catalogue, 1998, Invitrogen, Carlsbad CA).

Saccharomyces cerevisiae is another yeast that is commonly used in heterologous
expression systems. The plasmid YRp7 (Stinchcomb et al., Nature, 282:39, 1979; Kingsman et al.,
Gene, 7:141, 1979; Tschemper et al., Gene, 10:157, 1980) is commonly used as an expression vector
in Saccharomyces. This plasmid contains the trp1 gene that provides a selection marker for a mutant
strain of yeast lacking the ability to grow in tryptophan, such as strains ATCC No. 44,076 and PEP4-1
(Jones, Genetics, 85:12, 1977). The presence of the trp1 lesion as a characteristic of the yeast host
cell genome then provides an effective environment for detecting transformation by growth in the
absence of tryptophan.

Yeast host cells can be transformed using the polyethylene glycol method, as described by
Hinnen (Proc. Natl. Acad. Sci. USA, 75:1929, 1978). Additional yeast transformation protocols are
set forth in Gietz et al. (Nucl. Acids Res., 20(17):1425, 1992) and Reeves et al. (FEMS, 99(2-3):193-197,
1992).

Suitable promoting sequences in yeast vectors include the promoters for 3-phosphoglycerate
kinase (Hitzeman et al., J. Biol. Chem., 255:2073, 1980) or other glycolytic enzymes (Hess et al., J.
Adv. Enzyme Reg., 7:149, 1968; Holland et al., Biochemistry, 17:4900, 1978), such as enolase,
glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase,
phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase,
triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In the construction of
suitable expression vectors, the termination sequences associated with these genes are also ligated
into the expression vector 3' of the sequence desired to be expressed to provide polyadenylation of
the mRNA and termination. Other promoters that have the additional advantage of transcription
controlled by growth conditions are the promoter region for alcohol dehydrogenase 2, isocytochrome
C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the
aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose
and galactose utilization. Any plasmid vector containing a yeast-compatible promoter, origin of
replication and termination sequences is suitable.

Non-yeast eukaryotic vectors may be used with equal facility for expression of proteins
encoded by modified nucleotides according to the invention. Mammalian vector/host cell systems
containing genetic and cellular control elements capable of carrying out transcription, translation, and
post-translational modification are well known in the art. Examples of such systems are the well
known baculovirus system, the ecdysone-inducible expression system that uses regulatory elements
from Drosophila melanogaster to allow control of gene expression, and the sindbis viral expression
system that allows high-level expression in a variety of mammalian cell lines, all of which are
available from Invitrogen (Carlsbad, California).

B. Baculovirus-infected Insect Cells

Another representative eukaryotic expression system involves the recombinant baculovirus,
Autographa californica nuclear polyhedrosis virus (AcNPV; Summers and Smith, A Manual of
Methods for Baculovirus Vectors and Insect Cell Culture Procedures, 1986; Luckow et al.,
Biotechnol., 6:47-55, 1987). Infection of insect cells (such as cells of the species Spodoptera
frugiperda) with the recombinant baculoviruses results in the expression of taxoid 5α-hydroxylase
protein in the insect cells. Baculoviruses do not infect humans and can therefore be safely handled in
large quantities.

A baculovirus expression vector is prepared as previously described using standard
molecular biology techniques. The vector may comprise the polyhedrin gene promoter region of a
baculovirus, the baculovirus flanking sequences necessary for proper crossover during recombination
(the flanking sequences comprise about 200-300 base pairs adjacent to the promoter sequence) and a
bacterial origin of replication which permits the construct to replicate in bacteria. In particular
examples, the vector is constructed so that (i) the taxoid 5α-hydroxylase protein-encoding nucleic
acid sequence is operably linked to the polyhedrin gene promoter (collectively, the "expression
cassette") and (ii) the expression cassette is flanked by the above-described baculovirus flanking
sequences.

Insect host cells (such as Spodoptera frugiperda cells) are infected with a recombinant
baculovirus and cultured under conditions allowing expression of the baculovirus-encoded taxoid
5α-hydroxylase. The expressed oxygenase may, if desired, be extracted from the insect cells using
methods known in the art.

C. Mammalian

Mammalian host cells may also be used for heterologous expression of a disclosed
oxygenase, such as a taxoid 5α-hydroxylase. Examples of suitable mammalian cell lines include,
without limitation, monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651);
human embryonic kidney line 293S (Graham et al., J. Gen. Virol., 36:59, 1977); baby hamster kidney
cells (BHK, ATCC CCL 10); Chinese hamster ovary cells (Urlaub and Chasin, Proc. Natl. Acad. Sci.
USA, 77:4216, 1980); mouse Sertoli cells (TM4, Mather, Biol. Reprod., 23:243, 1980); monkey
kidney cells (CV1-76, ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC CRL-1587);
human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC
CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (W138, ATCC CCL
75); human liver cells (Hep G2, HB 8065); mouse mammary tumor cells (MMT 060562, ATCC CCL
51); rat hepatoma cells (HTC, MI.54, Baumann et al., J. Cell Biol., 85:1, 1980); and TRI cells
(Mather et al., Annals N.Y. Acad. Sci., 383:44, 1982). Expression vectors for these cells ordinarily
include (if necessary) DNA sequences for an origin of replication, a promoter located in front of the
gene to be expressed, a ribosome binding site, an RNA splice site, a polyadenylation site, and/or a
transcription terminator site.

Promoters used in mammalian expression vectors are often of viral origin. Such viral
promoters may be derived from polyoma virus, adenovirus 2, and simian virus 40 (SV40). The SV40
virus contains two promoters that are termed the early and late promoters. These promoters are
useful because they are both easily obtained from the virus as one nucleic acid fragment that also
contains the viral origin of replication (Fiers et al., Nature, 273:113, 1978). Smaller or larger SV40
DNA fragments may also be used, provided they contain the approximately 250-bp sequence
extending from the HindIII site toward the BglI site located in the viral origin of replication.
Alternatively, promoters that are naturally associated with the foreign gene (homologous promoters)
may be used provided that they are compatible with the host cell line selected for transformation.

An origin of replication may be obtained from an exogenous source, such as SV40 or other
virus (e.g., polyoma virus, adenovirus, VSV, BPV) and inserted into the expression vector.
Alternatively, the origin of replication may be provided by the host cell chromosomal replication
mechanism.

D. Prokaryotes 

Prokaryotes may also be used as host cells. Prokaryotic expression systems are useful for
(among other things) rapid production of large amounts of plasmid DNA, for production of
single-stranded DNA templates used for site-directed mutagenesis, for screening many mutants
simultaneously, and for DNA sequencing of the mutants generated. Suitable prokaryotic host cells
include, without limitation, E. coli K12 strain 294 (ATCC No. 31,446), E. coli strain W3110 (ATCC
No. 27,325), E. coli X1776 (ATCC No. 31,537), and E. coli B; however, many other strains of E. coli,
such as HB101, JM101, NM522, NM538, NM539, and many other species and genera of prokaryotes
including bacilli such as Bacillus subtilis, other enterobacteriaceae, such as Salmonella typhimurium
or Serratia marcescens, and various Pseudomonas species may all be used as hosts.

Prokaryotic host cells or other host cells with rigid cell walls may be transformed using any
method known in the art, including, for example, calcium phosphate precipitation, or electroporation.
Representative prokaryote transformation techniques are described in Dower (Genetic Engineering,
Principles and Methods, 12:275-296, Plenum Publishing Corp., 1990) and Hanahan et al. (Meth.
Enzymol., 204:63, 1991).

Plasmids typically used for transformation of E. coli include, without limitation, pBR322,
pUC18, pUC19, pUC118, pUC119, Bluescript M13 and derivatives thereof. Numerous such
plasmids are commercially available and are well known in the art. Representative promoters used in
prokaryotic vectors include the β-lactamase (penicillinase) and lactose promoter systems (Chang et
al., Nature, 275:615, 1978; Itakura et al., Science, 198:1056, 1977; Goeddel et al., Nature, 281:544,
1979), a tryptophan (trp) promoter system (Goeddel et al., Nucl. Acids Res., 8:4057, 1980), and the
alkaline phosphatase system.

E. Heterologous Protein Trafficking 

Trafficking sequences from plants, animals and microbes can be employed to direct the
expression of a disclosed oxygenase, such as a 5α-hydroxylase, to the cytoplasm, endoplasmic
reticulum, mitochondria or other cellular compartment, or to target the protein for export to the
medium.

Many eukaryotic proteins contain endogenous signal sequences. The nucleic acid
sequence encoding a signal sequence may be obtained as a restriction fragment from any gene
encoding a protein with a signal sequence. By ligating DNA encoding a signal sequence to the 5' end
of the DNA encoding a protein of interest, the resultant chimeric protein can be directed to the
destination conveyed by the signal sequence.

The signal sequences of several eukaryotic genes are known, including, for example, human
growth hormone, proinsulin, and proalbumin (see, e.g., Stryer, Biochemistry, Third Edition,
W.H. Freeman and Company, New York, N.Y., p. 769, 1988), and can be used as signal sequences in
appropriate eukaryotic host cells. Yeast signal sequences, such as acid phosphatase (Arima et al.,
Nucl. Acids Res., 11:1657, 1983), α-factor, alkaline phosphatase and invertase, may be used to direct
secretion from yeast host cells. Prokaryotic signal sequences from genes encoding, for example,
LamB or OmpF (Wong et al., Gene, 68:193, 1988), MalE, PhoA, or β-lactamase, as well as other
genes, may be used to target proteins from prokaryotic cells into the culture medium.

VIII. Production of an Antibody to a Taxoid 5α-hydroxylase Protein

Monoclonal or polyclonal antibodies may be produced to either the normal taxoid
5α-hydroxylase protein or variants of this protein. In one embodiment, antibodies raised against the
taxoid 5α-hydroxylase protein would specifically detect the taxoid 5α-hydroxylase protein. That is,
such antibodies would recognize and bind the taxoid 5α-hydroxylase protein, or fragments thereof,
and would not substantially recognize or bind to other proteins found in Taxus. In some
embodiments, antibodies against the Taxus cuspidata taxoid 5α-hydroxylase protein may recognize
taxoid 5α-hydroxylase from other paclitaxel-producing species (e.g., Taxomyces andreanae), and vice
versa. Antibodies to the disclosed oxygenase enzymes, and fragments thereof, may also be useful for
purification of the enzymes.

The determination that an antibody specifically binds to an antigen is made by any one of a
number of standard immunoassay methods; for instance, Western blotting (see, Sambrook et al.
(eds.), Molecular Cloning: A Laboratory Manual, 2nd ed., vols. 1-3, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY, 1989). To determine that a given antibody preparation (such as a
preparation produced in a mouse against SEQ ID NO: 2) specifically detects the oxygenase by
Western blotting, total cellular protein is extracted from cells and electrophoresed on an
SDS-polyacrylamide gel. The proteins are electrophoretically transferred to a membrane (for example,
nitrocellulose), and the antibody preparation is incubated with the membrane. After washing the
membrane to remove non-specifically bound antibodies, the presence of specifically bound antibodies
is detected by the use of a detector molecule (such as an anti-mouse antibody conjugated to an
enzyme such as alkaline phosphatase). Antibodies that specifically detect an oxygenase will be
shown, by this technique, to bind substantially only the oxygenase band (having a position on the gel
determined by the molecular weight of the oxygenase).

Substantially pure oxygenase suitable for use as an immunogen can be isolated from
transfected cells, transformed cells, or from wild-type cells. Concentration of protein in the final
preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a
few micrograms per milliliter. Alternatively, peptide fragments of an oxygenase may be utilized as
immunogens. Such fragments may be synthesized chemically using standard methods, or may be
obtained by cleavage of the whole oxygenase enzyme followed by purification of the desired peptide
fragments. Peptides as short as three or four amino acids in length are immunogenic when presented
to an immune system in the context of a Major Histocompatibility Complex (MHC) molecule, such
as MHC class I or MHC class II. Accordingly, peptides comprising at least 3 and preferably at least
4, 5, 6 or more consecutive amino acids of the disclosed oxygenase amino acid sequences may be
employed as immunogens for producing antibodies.

Because naturally occurring epitopes on proteins frequently comprise amino acid residues
that are not adjacently arranged in the peptide when the peptide sequence is viewed as a linear
molecule, it may be advantageous to utilize longer peptide fragments from the oxygenase amino acid
sequences for producing antibodies. Thus, for example, peptides that comprise at least 10, 15, 20, 25,
or 30 consecutive amino acid residues of the amino acid sequence may be employed. Monoclonal or
polyclonal antibodies to the intact oxygenase, or peptide fragments thereof, may be prepared as
described below.
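
As an illustration of what "consecutive amino acid residues" means in practice, the following minimal sketch enumerates every peptide of a chosen length that can be read from a protein sequence. The sequence shown is invented for the example and is not one of the disclosed oxygenase sequences; the function name is likewise hypothetical.

```python
def consecutive_peptides(sequence: str, length: int) -> list[str]:
    """Return every run of `length` consecutive residues found in `sequence`."""
    if length < 1:
        raise ValueError("peptide length must be at least 1")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 26-residue sequence, used only to exercise the function.
example_sequence = "MDALYKSTVAKFNEVIQLDCSTEFFS"

ten_mers = consecutive_peptides(example_sequence, 10)
thirty_mers = consecutive_peptides(example_sequence, 30)  # longer than the sequence
```

A sequence of n residues yields n - k + 1 candidate peptides of length k, and none when k exceeds n. Which of these candidates are actually immunogenic cannot be determined from the sequence listing alone.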

A. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibodies can be prepared from murine hybridomas according to the classical
method of Kohler and Milstein (Nature, 256:495-497, 1975) or derivative methods thereof. In one
specific, non-limiting embodiment, a mouse is repetitively inoculated with a few micrograms of the
selected protein over a period of a few weeks. The mouse is then sacrificed, and the
antibody-producing cells of the spleen isolated. The spleen cells are fused with mouse myeloma cells using
polyethylene glycol, and the excess, non-fused cells destroyed by growth of the system on selective
media comprising aminopterin (HAT media). Successfully fused cells are diluted and aliquots of the
dilution placed in wells of a microtiter plate, where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by
immunoassay procedures, such as ELISA, as originally described by Engvall (Enzymol., 70(A):419-439,
1980), and derivative methods thereof. Selected positive clones can be expanded and their
monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody
production are described in Harlow and Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor
Laboratory Press, New York, 1988).

B. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can
be prepared by immunizing suitable animals with the expressed protein (for instance, expressed using
a method described herein), which, in one specific, non-limiting embodiment, can be modified to
enhance immunogenicity. Effective polyclonal antibody production is affected by many factors
related both to the antigen and the host species. In one embodiment, small molecules may tend to be
less immunogenic than others and may require the use of carriers and adjuvant, examples of which
are known. In another embodiment, host animals may vary in response to site of inoculations and
dose, with either inadequate or excessive doses of antigen resulting in low titer antisera. In one
specific, non-limiting embodiment, a series of small doses (ng level) of antigen administered at
multiple intradermal sites may be most reliable. An effective immunization protocol for rabbits can
be found in Vaitukaitis et al. (J. Clin. Endocrinol. Metab., 33:988-991, 1971).

In one embodiment, booster injections will be given at regular intervals, and antiserum
harvested when antibody titer thereof begins to fall, as determined semi-quantitatively (for example,
by double immunodiffusion in agar against known concentrations of the antigen). See, for example,
Ouchterlony et al. (In Handbook of Experimental Immunology, Wier, D. (ed.) chapter 19. Blackwell,
1973). In one specific, non-limiting embodiment the plateau concentration of antibody is usually in
the range of about 0.1 to 0.2 mg/ml of serum (about 12 μM). Affinity of the antisera for the antigen
is determined by preparing competitive binding curves, as described, for example, by Fisher (Manual
of Clinical Immunology, Ch. 42, 1980).

C. Antibodies Raised against Synthetic Peptides 

A third approach to raising antibodies against the taxoid 5α-hydroxylase protein is to use
synthetic peptides synthesized on a commercially available peptide synthesizer based upon the
predicted amino acid sequence of the taxoid 5α-hydroxylase protein. Polyclonal antibodies can be
generated by injecting such peptides into, for instance, rabbits.

D. Antibodies Raised by Injection of Taxoid 5α-Hydroxylase Encoding Sequence

In one embodiment, antibodies may be raised against the taxoid 5α-hydroxylase protein by
subcutaneous injection of a recombinant DNA vector that expresses the taxoid 5α-hydroxylase
protein into laboratory animals, such as mice. In one specific, non-limiting embodiment, delivery of
the recombinant vector into the animals may be achieved using a hand-held form of the Biolistic
system (Sanford et al., Particulate Sci. Technol., 5:27-37, 1987), as described by Tang et al. (Nature,
356:152-154, 1992). In other embodiments, expression vectors suitable for this purpose may include
those that express the taxoid 5α-hydroxylase encoding sequence under the transcriptional control of
either the human β-actin promoter or the cytomegalovirus (CMV) promoter.

IX. Methods of Using 5α-Hydroxylase

The creation of recombinant vectors and transgenic organisms expressing vectors disclosed
herein is useful for controlling the production of the disclosed oxygenases, such as the
5α-hydroxylase. These vectors can be used to decrease oxygenase production or to increase
oxygenase production. Increased production of oxygenase can be achieved by including at least one
additional oxygenase encoding sequence in the vector. These vectors can be introduced into a host
cell, thereby altering oxygenase production. In the case of increased production, the resulting
oxygenase may be used in in vitro systems, as well as in vivo, for increased production of paclitaxel,
other taxoids, intermediates of the paclitaxel biosynthetic pathway, and other products.

A. Production of Paclitaxel or Other Taxoid In Vivo

One attractive alternative to yew harvest and/or paclitaxel semisynthesis is the production of
paclitaxel and taxoids in vivo, such as in transgenic organisms and/or cell culture (including, for
example, Taxus cell culture). Cell culture, for example, lends itself to vat fermentation format
(potentially as a continuous process), a high level of process control, and ease of product isolation
and purification. This practice further provides the possibility of biochemical/molecular
manipulation to direct biosynthesis to specific taxoid precursors, modified forms, and derivatives.

In current practice at the small scale, Taxus cell cultures produce about 10-100 mg/L of
paclitaxel (up to 1 gram total taxoids/L) in production runs of about 7-10 days; however, production
levels are quite variable and not sustainable with time or at scale. Commercially viable production
levels of paclitaxel are estimated to be between about 200-400 mg/L, and of precursors for
semi-synthesis in the range of about 400-800 mg/L. Enhancement of production levels and/or
redirection of taxoid metabolism can be useful to achieve economic viability. Preferably, production
levels are consistent and reliable. A system that is biochemically manipulable can permit synthesis of
a range of taxoid derivatives (e.g., alternative precursors and second generation drugs). Such a
system is now enabled by the disclosure of the 5α-hydroxylase protein and nucleic acid sequences.
This enzyme is believed to catalyze a slow step in the paclitaxel biosynthetic pathway; thus, alone
and in combination with other enzymes of the paclitaxel pathway, the disclosed 5α-hydroxylase
protein and nucleic acid sequences permit molecular genetic manipulation (genetic engineering) of
cultured cells, such as Taxus cells, to increase yields of paclitaxel and to direct the pathway to
desirable taxoid metabolites.
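
The gap between the reported and the commercially viable paclitaxel titers can be put in rough numerical terms. The sketch below uses only the two ranges quoted in the preceding paragraph; the function name and the way the bounds are paired are assumptions made for illustration, not a method taken from this disclosure.

```python
# Figures quoted in the text above (mg of paclitaxel per liter of culture).
REPORTED_TITER_MG_PER_L = (10.0, 100.0)  # small-scale Taxus cell cultures
VIABLE_TITER_MG_PER_L = (200.0, 400.0)   # estimated commercially viable range


def required_fold_range(current, target):
    """Return (smallest, largest) fold improvement needed to move from the
    current titer range into the target titer range."""
    current_low, current_high = current
    target_low, target_high = target
    if current_low <= 0:
        raise ValueError("titers must be positive")
    # Best-performing culture reaching the low end of the target range.
    smallest = target_low / current_high
    # Poorest-performing culture reaching the high end of the target range.
    largest = target_high / current_low
    return smallest, largest


smallest_fold, largest_fold = required_fold_range(
    REPORTED_TITER_MG_PER_L, VIABLE_TITER_MG_PER_L
)
```

On these figures the required improvement spans from roughly 2-fold to roughly 40-fold, depending on where in the reported range a given culture starts and which end of the viable range is targeted.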

Production of paclitaxel and related taxoids (such as taxoid-5-ols, including isomers of
taxadien-5-ol) in vivo can be accomplished by transfecting a host cell, such as one derived from the
Taxus genus, with a vector capable of expressing, at least, a disclosed oxygenase (such as a taxoid
5α-hydroxylase). Methods of making and using suitable expression vectors and transforming a
variety of cell types with such vectors have been described above. In certain examples, heterologous
or homologous oxygenase sequences are placed under the control of a constitutive promoter, or an
inducible promoter; thus, any naturally occurring feedback that might otherwise downregulate
oxygenase expression under natural conditions will be eliminated.

In some methods, the host cell does not produce any paclitaxel prior to transfection, in
which case, particular methods can involve feeding taxoids (such as paclitaxel intermediates) to the
cell. In other methods, a host cell will express a detectable amount of paclitaxel prior to transfection
so that transfection with the expression vector increases the production of paclitaxel in the transfected
cell. In particular examples of these methods, paclitaxel production in a transfected cell may be
increased by at least two fold, such as at least four fold, at least 10 fold, at least 20 fold, at least 50
fold or at least 100 fold.
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A host cell, which has been (or will be) transfected with a disclosed SO'hydroxylase, may 
also be transfected (usiiig either the same or a different e9q;>ression vector) with nucleic acid 
sequences encoding odier enzymes having activities useful for the biosynthesis of paclitaxel 
including, for example, taxadiene synthase (such as SEQ ID NO: 19 or the protein-encoding portion 
S thereoO> taxadienol acetyl transferase (such as, SEQ ID NO: 21 or the protein-encoding portion 
thereof; also known as TAXI or TAT), 2-debenzoyl-7,13-diacetylbaccatin m-2-O-benzoyl 
transferase (such as, SEQ ID NO: 23 or the protein-encoding portion thereof; also known as TAX2), 
10-deacetyIbaccatin m-10-O-acctyI transferase (such as, SEQ ID NO: 34 or the protein-encoding 
portion thereof; also known as TAX6 or DBAT), taxoid 13-phenylpropanoyltransferase (such as, 

10 SEQ ID NO: 36 or the protein-encoding portion dieieof; also known as TAX7); 3 '-N-debenzoyltaxol 
N-benzoyltransferase (such as, SEQ ID NO: 38 or tihe proteuirencoding portion diereof; also known 
as TAXIO or DBNIBT); or any of a several taxoid oxygenases, including, without limitation, taxoid 
7/3-hydroxylase, taxoid 14/3-hydroxyhise, taxoid 10/^hydroxylase, taxoid 13c0-hydroxyhise, or taxoid 
2of-hydroxylase (such as, one or more of die taxoid-oxygenases set forth in SEQ ID NOs: 3, 5, 7, 9, 

15 1 1, 13, 15, 17 or 40). Variants of each of die foregoing enzymes, which maintain the function of the 
prototype enzyme, would be equally suitable for use in die above-described multi-gene expression 
system. Such variants can be, for instance, at least 70%, at least 80%, at least 90%, or at least 95% 
percent identical to eitiier the nucleic acid sequence encoding, or the amino acid sequence of, the 
prototype enzyme. 
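
The identity thresholds recited above can be checked mechanically once a variant has been aligned to its prototype. The following is a minimal sketch only, assuming the two sequences are already aligned to equal length with '-' marking gaps. The function names and the toy peptide strings are hypothetical illustrations and are not taken from the sequence listing. Percent identity can be normalized in more than one way; this sketch divides by the number of aligned columns and counts gaps as mismatches.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; a gap ('-') counts as a mismatch."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    prototype = "MDALYKSTVA"  # hypothetical toy sequence
    variant = "MDALYRSTVA"    # one substitution out of ten positions
    print(percent_identity(prototype, variant))
    for cutoff in (70, 80, 90, 95):
        print(cutoff, meets_threshold(prototype, variant, cutoff))
```

With real sequences, the alignment itself would come from a dedicated alignment tool; only the counting step is shown here.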

Methods and constructs for the introduction of multiple protein-encoding nucleic acid
sequences (such as cDNAs) into cells, such as plant cells (including, e.g., Taxus cell lines), using
single or multiple transformation event(s) have been described (see, e.g., U.S. Pat. No. 6,337,431;
U.S. Pat. Pub. No. 20020129400; U.S. Pat. Pub. No. 20020059660; de Felipe, Curr. Gene Ther.,
2(3):355-378, 2002). For example, techniques commonly used for introduction of multiple genes
into cells include: (i) co-transformation with mixed multiple plasmid vectors containing different
protein-encoding sequences using any transfection method known in the art (e.g., Chen et al., Nat.
Biotechnol., 16:1060-1064, 1998; Ye et al., Science, 287:303-305, 2000); (ii) sequential
re-transformation of the same recipient cell (or cell population) with vectors where each vector contains
one or a few protein-encoding sequences (e.g., Lapierre et al., Plant Physiol., 119:153-163, 1999); or
sexual crossing between transgenic organisms carrying different transgenes to recombine the genes to
a single organism (e.g., Ma et al., Science, 268:716-719, 1995); and (iii) linking of multiple genes of
different sources into the same vector using conventional molecular cloning technology for
transformation (e.g., Van Engelen et al., Plant Mol. Biol., 26:1701-1710, 1994; Daniell and Dhingra,
Curr. Opin. Biotechnol., 13:136-141, 2001). In particular examples, a multi-gene construct includes a
promoter, nucleic acid sequences encoding two or more proteins, inteins, and transcription
termination sequences and, optionally, sequences encoding targeting sequences, or tissue specific
sequences, such as tissue-specific targeting peptides.

Cells may be transfected (i.e., transformed) with one or more constructs useful for the
expression of multiple protein-encoding sequences in a single cell in any manner known in the art or
as described herein including, without limitation, Agrobacterium transformation of plant cells (see,
for instance, Han et al., Plant Sci., 95(2):187-196, 1994).

B. Production of Paclitaxel or Other Taxoids In Vitro
Currently, paclitaxel is produced by a semisynthetic method described in Hezari and
Croteau, Planta Medica, 63:291-295, 1997. This method involves extracting 10-deacetyl-baccatin
III, or baccatin III, intermediates in the paclitaxel biosynthetic pathway, and then finishing the
production of paclitaxel using chemical techniques. With the provision of a taxoid 5α-hydroxylase
herein, it is now possible to utilize this enzyme (and its variants) to hydroxylate taxoids (such as
paclitaxel intermediates) to produce taxoid-5-ols (including, for example, taxadien-5-ol isomers).
Such taxoid-5-ols can be used, for example, to facilitate the production of paclitaxel and related
taxoids.

In vitro methods involve transfection of a host cell with a vector expressing a disclosed
5α-hydroxylase, as described previously. Following transfection, the recombinant enzyme may
hydroxylate available taxoid substrates (including, e.g., taxadiene isomers such as
taxa-4(5),11(12)-diene and taxa-4(20),11(12)-diene). Such substrates can be naturally present in the host
cell (such as a Taxus cell) or can be administered to the host cell, for example, by adding the
exogenous substrate to the media bathing the cells. Under these circumstances, 5α-hydroxylation of
the substrate can occur in vivo (as discussed in the preceding section) and the resultant product, such
as a taxadien-5-ol, can be extracted and/or purified for further in vitro processing, including the
synthesis of paclitaxel, paclitaxel intermediates, or other taxoids.

In other methods, the 5α-hydroxylase protein can be isolated from transfected cells. The
isolated protein can, then, be used as a reagent in reactions involving the 5α-hydroxylation of taxoid
substrates, including taxadiene isomers (such as taxa-4(5),11(12)-diene and
taxa-4(20),11(12)-diene).

Embodiments of the invention are illustrated by the following non-limiting Examples.

EXAMPLES 
Example 1 

HOMOLOGY-BASED CLONING OF CYTOCHROME P450 OXYGENASES FROM TAXUS
Previous studies have used DD-RT PCR to obtain cytochrome P450 taxoid oxygenase
clones from methyl jasmonate-induced Taxus cells. The DD-RT PCR method is limited because it
may fail to identify transcripts that are not highly induced by the inducing agent used in the method.
This Example describes a strategy for cloning taxoid oxygenases, such as the disclosed taxoid
5α-hydroxylase, that is not subject to the biases of DD-RT PCR (Udvardi et al., Plant Physiol.,
105:755-756, 1994; Holton and Lester, Methods Enzymol., 272:275-283, 1996; Pauli and Kutchan,
Plant J., 13:793-801, 1998). This strategy is based upon two highly conserved regions of P450
oxygenase proteins, the commonly occurring PERF motif and the region surrounding the invariant,
heme-binding cysteine residue (von Wachenfeldt and Johnson, "Structures of eukaryotic cytochrome
P450 enzymes," In: Cytochrome P450: Structure, Mechanism, and Biochemistry, 2nd Ed., P.R. Ortiz
de Montellano, ed., New York: Plenum, pp. 183-223, 1995).

Unless expressly stated otherwise, enzymes and reagents used in this and other Examples
were obtained from Gibco/BRL (Grand Island, NY), Invitrogen (Carlsbad, CA), New England
Biolabs (Beverly, MA) and Stratagene (La Jolla, CA), as indicated, and were used according to the
respective manufacturer's instructions. Other chemicals were purchased from Merck (Darmstadt,
Germany) and Sigma (St. Louis, MO).

Degenerate and inosine-containing oligonucleotide forward primers directed to the PERF
motif and its variant forms were designed based on amino acid sequence alignments of cytochrome
P450 oxygenases of plant origin, which were available in the public databases. The following
forward primers directed to the PERF motif were synthesized:

5'-TTY MGI Ca AGM GIT TYG AR-3' (SEQ ID NO: 25)

5'-TTY MGI ca TCI MGI TTY GAR-3' (SEQ ID NO: 26)

5'-CKl m ca Ga CCR AAI GG-3' (SEQ ID NO: 27)

5'-GAR GAR TTY MGN CCN GAR MG-3' (SEQ ID NO: 28)

5'^AR AAR TTY m Ca GAI ARG TTY (SEQ ID NO: 29)
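
To illustrate what "degenerate" means for primers of this kind, the sketch below expands IUPAC ambiguity codes, counts how many distinct oligonucleotides a degenerate primer represents, and lists the residues each degenerate codon can encode. It is an illustration only, not the primer-design procedure used in this Example. It uses the SEQ ID NO: 28 primer as legible above (spacing normalized into codons); treating inosine (I) as equivalent to N is a simplifying assumption, and all function names are hypothetical.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes. Inosine (I) is treated as pairing with
# any base, which is a simplifying assumption made for this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT",
    "S": "CG", "W": "AT", "N": "ACGT", "I": "ACGT",
}

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    for combo in product(*pools):
        yield "".join(combo)


def encoded_residues(codon: str) -> set:
    """Amino acids (one-letter codes) that a degenerate codon can encode."""
    return {CODON_TABLE[concrete] for concrete in expand(codon)}


if __name__ == "__main__":
    primer = "GAR GAR TTY MGN CCN GAR MG"  # SEQ ID NO: 28 as read above
    print("degeneracy:", degeneracy(primer))
    for codon in primer.split()[:-1]:  # the last group is a partial codon
        print(codon, sorted(encoded_residues(codon)))
```

Run on this primer, the codons read Glu, Glu, Phe, Arg (or Ser), Pro, Glu, which is consistent with a primer targeting the region around the PERF motif.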

Using a similar strategy, degenerate and inosine-containing oligonucleotide reverse primers 
directed to the conserved heme-binding region were designed and synthesized. These reverse primers 
are: 

5'-GGR CAI m CKI m Ca CCl CCR AAI GG-3' (SEQ ID NO: 30)

5'^CI GGR CAI ATI MKY YTI CCI GCI CCR AAI GG-3' (SEQ ID NO: 31).

Amplification (Pauli and Kutchan, Plant J., 13:793-801, 1998), using first strand cDNA
template derived from mRNA isolated from T. cuspidata cells 16 hours post-induction with methyl
jasmonate (Ketchum et al., Biotechnol. Bioengin., 62:97-105, 1999; Schoendorf et al., Proc. Natl.
Acad. Sci. USA, 98:1501-1506, 2001), yielded amplicons of the predicted size (i.e., about 200 base
pairs). The amplicons were gel purified, ligated into pGEM-T™ (Promega, Madison, WI), and
transformed into E. coli JM109 cells for plasmid preparation and insert sequencing.

Based on the amplicon sequences, probes of 40 to 50 nucleotides in length were synthesized,
5'-labeled with [³²P]dCTP (ICN, Irvine, CA) using T4 polynucleotide kinase (New England Biolabs,
Beverly, MA), and used to screen the previously described induced T. cuspidata λ-ZAPII™ cDNA
library (Schoendorf et al., Proc. Natl. Acad. Sci. USA, 98:1501-1506, 2001) by employing
Rapid-Hyb™ (Amersham Pharmacia, Piscataway, NJ) solution. Following 3 rounds of screening,
32 positive plaques were in vivo excised as pBluescript II SK(-)™ phagemids in accordance with the
manufacturer's (Stratagene) protocol, and partially sequenced using T3 and T7 promoter primers.
Based on sequence information, clones that were previously obtained by the DD-RT PCR screen
(Schoendorf et al., Proc. Natl. Acad. Sci. USA, 98:1501-1506, 2001) were set aside and not further
examined. These new clones were obtained in full-length form by Marathon™ 5'-RACE (Clontech,
Palo Alto, CA), as necessary, and were fully sequenced.

One clone, designated S1, represented the most abundant cytochrome P450 cDNA isolated
by this homology-based cloning approach. The S1 clone was 1688 base pairs in length (GenBank
accession no. AY289209) and contained an apparent ORF of 1509 base pairs encoding a predicted
protein of 502 amino acids with deduced molecular weight of 56,859 Daltons. The deduced amino
acid sequence of clone S1 exhibited characteristics commonly known in the art to be typical of
cytochrome P450 enzymes (von Wachenfeldt and Johnson, "Structures of eukaryotic cytochrome
P450 enzymes," In: Cytochrome P450: Structure, Mechanism, and Biochemistry, 2nd Ed., P.R. Ortiz
de Montellano, ed., New York: Plenum, pp. 183-223, 1995), including the oxygen-binding domain
(amino acid residues 270-285), an N-terminal membrane anchor (amino acid residues 1-30), the
highly conserved heme-binding motif (amino acid residues 433-441) with PFG element (amino acids
at positions 437-439), and the absolutely conserved cysteine at position 445.
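
The relationship between the reported ORF length, protein length, and molecular weight can be verified with simple arithmetic. The sketch below is illustrative only; it assumes the 1509 base pair ORF figure includes the stop codon (which is what makes 1509 base pairs correspond to 502 residues), and the function names are hypothetical.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def mean_residue_mass(protein_mass_da: float, n_residues: int) -> float:
    """Average mass per residue in Daltons."""
    return protein_mass_da / n_residues


if __name__ == "__main__":
    clone_bp, orf_bp, mass_da = 1688, 1509, 56859  # values reported for clone S1
    n_residues = orf_protein_length(orf_bp)
    print("residues:", n_residues)
    print("untranslated bp:", clone_bp - orf_bp)
    print("mean residue mass (Da):", round(mean_residue_mass(mass_da, n_residues), 1))
```

The resulting mean residue mass of about 113 Da falls in the range expected for a typical protein, so the three reported figures are mutually consistent.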

Comparisons of the clone S1 deduced amino acid sequence with the amino acid sequences
of previously characterized cytochrome P450 taxoid hydroxylases, including the taxoid
10β-hydroxylase (GenBank Accession No. AF318211), 13α-hydroxylase (GenBank Accession
No. AY056019) and 14β-hydroxylase (GenBank Accession No. AY188177), revealed overall
identities in the 61-63% range and similarities in the 79-81% range (FIG. 2).

The sequence analyses described in this Example provide strong evidence that clone S1
encodes a taxoid oxygenase.

Example 2

CYTOCHROME P450 cDNA EXPRESSION IN YEAST
This Example demonstrates one method for readily expressing taxoid oxygenases, such as
taxoid 5α-hydroxylase, in yeast.

For functional expression in Saccharomyces cerevisiae, the deduced ORF of clone S1 was
amplified by PCR using a gene-specific forward primer (containing the ATG start codon) and a
corresponding reverse primer in which the stop codon was deleted to permit read-through when
transferred to the expression vector, pYES2.1/V5-HIS-TOPO™ (Invitrogen) (see, e.g., SEQ ID
NOs: 32 and 33).

The clone S1 ORF amplicon was cloned into pYES2.1/V5-HIS-TOPO™ using standard
techniques. Vector sequences in frame with the cloned S1 ORF encode the simian V5 epitope and a
histidine (His6) tag. Thus, the resultant expression vector (referred to herein as pYES2.1/S1-V5-HIS)
encodes a fusion protein containing the complete clone S1 protein with a C-terminal simian
V5 epitope and histidine (His6) tag. This tagging procedure allows detection of the expressed enzyme
via immunoblot analysis of the microsomal protein preparation using commercially available
antibodies, and has been shown not to compromise the activity of other recombinant taxoid
hydroxylases (Jennewein et al., Arch. Biochem. Biophys., 413:262-270, 2003). The
pYES2.1/S1-V5-HIS insert was sequenced using the Gal1 (forward) and V5 C-term (reverse) primers
(available from Invitrogen) to confirm that the expected S1 ORF sequence was present and in the correct
orientation for expression.

The verified S1 clone expression vector was transformed into Saccharomyces cerevisiae
strain WAT11 using the lithium acetate method (Ito et al., J. Bacteriol., 153:163-168, 1983). The
WAT11 strain harbors a galactose-inducible NADPH-cytochrome P450 reductase from Arabidopsis
thaliana, which is required for efficient reductive coupling to the cytochrome (Pompon et al.,
Methods Enzymol., 272:51-64, 1996). This yeast expression system also permits testing of catalytic
activity by in vivo feeding of taxoid substrates to the transformed yeast (Schoendorf et al., Proc. Natl.
Acad. Sci. USA, 98:1501-1506, 2001), thereby eliminating the need for microsome isolation in
preliminary functional screening assays (as discussed in more detail in Example 3).

Transformed yeast cells were grown to stationary phase in 2 ml of SOIA medium at 30°C
with 250 rpm mixing. The cells were then harvested by centrifugation (2000g, 10 minutes) and the
cell pellet was suspended in 3 ml YPLA galactose-containing induction medium. Approximately
9 hours after induction, the cells were harvested again by centrifugation.

For immunoblotting, the cells were resuspended in lysis buffer (100 mM Tris HCl, pH 8.5,
containing 1 mM DTT and 10% v/v glycerol), lysed by sonication (VirSonic, microtip probe,
medium setting, 3 x 30 sec, VirTis Co., Gardiner, NY) or by use of a Bead Beater (Biospec Products,
Bartlesville, OK), and the microsomes prepared (Pompon et al., Methods Enzymol., 272:51-64,
1996). Protein (50 µg) was then separated by SDS-PAGE (10% denaturing gel), transferred by wet
transfer blotting to nitrocellulose and immobilized by UV-crosslinking. The blot was serially
incubated with mouse Penta-His-specific antibody (Qiagen, Valencia, CA) as primary antibody, and
alkaline phosphatase-conjugated AffiniPure™ goat anti-mouse IgG (Jackson ImmunoResearch, West
Grove, PA) as secondary antibody for detection. The Qiagen protocols were used throughout, with
His-size markers as reference, and protein preparations from transformed cells harboring empty
vector as negative controls.

A single protein of approximately 57 kDa was specifically identified by Western blot. The
observed molecular weight of this recombinant protein agrees with the calculated molecular weight
of the deduced S1 protein sequence (see Example 1).

Example 3

IN SITU SCREENING FOR CYTOCHROME P450 FUNCTION
This Example demonstrates that clone S1 can be efficiently expressed in yeast and that
taxa-4(5),11(12)-diene and taxa-4(20),11(12)-diene are exemplar high affinity substrates for the clone S1
oxygenase.

Following confirmation of clone S1 expression by immunoblot analysis (see Example 2), the
activity of the recombinant cytochrome P450 enzyme was demonstrated by in vivo feeding as
previously described by Schoendorf et al. (Proc. Natl. Acad. Sci. USA, 98:1501-1506, 2001). This
in vivo feeding protocol eliminated the uncertainties associated with microsome isolation and in vitro
assay, including the instability of P450 oxygenases in yeast membranes (Schoendorf et al., Proc.
Natl. Acad. Sci. USA, 98:1501-1506, 2001).

Transformed and induced yeast cells were harvested by centrifugation as described in
Example 2. As a negative control for the feeding experiments, the yeast host was transformed with
the pYES2.1/V5-HIS-TOPO™ vector containing a β-glucuronidase insert instead of the cytochrome
P450 S1 clone. The cells were resuspended in 3 ml of fresh YPLA medium to which 30 µM of the
labeled test substrate was added. Test substrates included:



Substrate                                        Activity   Reference
                                                 (Ci/mol)

(±)-[20-³H]taxa-4(5),11(12)-diene                5.3        Rubenstein et al., J. Label. Compds.
                                                            Radiopharm., 43:481-491, 2000

(±)-[20-³H]taxa-4(20),11(12)-diene               2.6        Rubenstein et al., J. Label. Compds.
                                                            Radiopharm., 43:481-491, 2000

(±)-taxa-4(20),11(12)-dien-5α-ol                 2.0        Rubenstein et al., J. Label. Compds.
                                                            Radiopharm., 43:481-491, 2000

(±)-taxa-4(20),11(12)-dien-5α-yl acetate         2.0        Walker et al., Arch. Biochem. Biophys.,
                                                            364:273-279, 1999; Lovy Wheeler et al.,
                                                            Arch. Biochem. Biophys., 390:265-278,
                                                            2001

(±)-taxa-4(20),11(12)-dien-5α-acetoxy-10β-ol     2.0        Jennewein et al., Proc. Natl. Acad. Sci.
                                                            USA, 98:13595-13600, 2001; Jennewein et
                                                            al., Arch. Biochem. Biophys., 413:262-270,
                                                            2003

(±)-taxa-4(20),11(12)-dien-5α,13α-diol           2.0        Jennewein et al., Proc. Natl. Acad. Sci.
                                                            USA, 98:13595-13600, 2001; Jennewein et
                                                            al., Arch. Biochem. Biophys., 413:262-270,
                                                            2003

(+)-[³H-acetyl]taxusin*                          10.0       De Case et al., Chem. Commun.,
                                                            1282-1294, 1969; Chau et al., Chem. Biol.,
                                                            11:663-672, 2004; Chau and Croteau, Arch.
                                                            Biochem. Biophys., 427:48-57, 2004

*Tetraacetate of taxa-4(20),11(12)-dien-5α,9α,10β,13α-tetraol



The cell and test substrate suspension was incubated overnight at 30°C with mixing
(250 rpm). The incubation mixture was then treated for 15 minutes in a sonication bath and extracted
twice with 3 ml of hexane:ethyl acetate (4:1 v/v). The organic extract was then dried under N2, the
residue dissolved in 100 µl of acetonitrile, and an aliquot was separated by reversed-phase
radio-HPLC (250 mm x 4.6 mm column of Alltech (Deerfield, IL) Econosil C18 (5 µm); flow rate of
1 ml/min; with radio-detection of the effluent (Flow-One-Beta Series A-1000, Radiomatic Corp.,
Meriden, CT)). The following gradient was employed: 0-5 minutes at 100% Solvent A (97.99%
H2O with 2% CH3CN and 0.01% H3PO4 (v/v)), 5-15 minutes at 0-50% Solvent B (99.99% CH3CN
with 0.01% H3PO4 (v/v)), 15-55 minutes at 50-100% Solvent B, 55-65 minutes at 100% Solvent B,
65-70 minutes at 0-100% Solvent A, 70-75 minutes at 100% Solvent A. The HPLC eluant was
collected in 1 minute fractions and the appropriate fractions containing the radiolabeled product were
combined, dried under a stream of N2, and dissolved in the minimum volume of benzene for GC-MS
analysis.

GC-MS analyses were performed on a Hewlett-Packard 6890 GC-MSD system using a ZB-5
capillary column (Phenomenex (Torrance, CA); 30 m length; 0.25 mm inner diameter; coated with a
0.25 µm film of phenyl (5%) polysiloxane). Cool on-column injection was used, with He flow rate of
0.7 ml/min and a temperature program from 40°C to 320°C at 20°C/min. Spectra were recorded
at 70 eV.
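
The radio-HPLC gradient described above can be written down as a set of time and composition breakpoints. The sketch below is an illustration only: it assumes each ramp is linear between the stated breakpoints (the text gives only the endpoints of each segment) and expresses every segment as percent Solvent B, so that the 65-70 minute return to Solvent A becomes a ramp from 100% to 0% B. The names used are hypothetical.

```python
from bisect import bisect_right

# (time in minutes, percent Solvent B) breakpoints taken from the gradient
# described in the text; linear interpolation between them is an assumption.
GRADIENT = [
    (0.0, 0.0),     # 0-5 min: 100% Solvent A
    (5.0, 0.0),
    (15.0, 50.0),   # 5-15 min: 0-50% Solvent B
    (55.0, 100.0),  # 15-55 min: 50-100% Solvent B
    (65.0, 100.0),  # 55-65 min: hold at 100% Solvent B
    (70.0, 0.0),    # 65-70 min: return to 100% Solvent A
    (75.0, 0.0),    # 70-75 min: hold at 100% Solvent A
]


def percent_b(t_min: float) -> float:
    """Percent Solvent B at time t_min, by linear interpolation."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the gradient program")
    i = min(bisect_right(times, t_min), len(GRADIENT) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 5, 10, 35, 60, 67.5, 75):
        print(f"{t:>5} min: {percent_b(t):5.1f}% B")
```

A table of this form is convenient when matching collected 1 minute fractions to the solvent composition at which a product eluted.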

Radio-HPLC analysis showed that the two taxadiene isomers were most efficiently (almost
quantitatively in the case of the 4(5),11(12)-isomer) converted to more polar products. In the case of
taxa-4(5),11(12)-diene as substrate, the principal biosynthetic product (>92%) eluted with a retention
time identical to that of taxa-4(20),11(12)-dien-5α-ol (Hefner et al., Chem. Biol., 3:479-489, 1996;
Rubenstein et al., J. Label. Compds. Radiopharm., 43:481-491, 2000) and the minor product (<8%)
eluted with a retention time consistent with that of a taxadien-diol. GC-MS analysis (electron impact
ionization) confirmed the major product to possess a retention time and mass spectrum identical to
that of authentic taxa-4(20),11(12)-dien-5α-ol (Hefner et al., Chem. Biol., 3:479-489, 1996) with
characteristic ions at m/z 288 (P+), 273 (P+-CH3), 270 (P+-H2O) and 255 (P+-H2O-CH3). The minor,
more polar product yielded a mass spectrum consistent with that of a taxadien-diol (ions
corresponding to the loss of a methyl and two molecules of water from an unobserved parent ion of
m/z 304).
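
The diagnostic ions listed above follow from nominal-mass arithmetic. The sketch below is illustrative only: it assumes the molecular formulas C20H32O for a taxadienol and C20H32O2 for a taxadien-diol, uses integer nominal masses, and the helper names are hypothetical.

```python
# Nominal (integer) atomic masses for the elements involved.
NOMINAL = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula: dict) -> int:
    """Nominal mass of a species given as {element: count}."""
    return sum(NOMINAL[element] * count for element, count in formula.items())


TAXADIENOL = {"C": 20, "H": 32, "O": 1}     # assumed formula of a taxadien-5-ol
TAXADIEN_DIOL = {"C": 20, "H": 32, "O": 2}  # assumed formula of a taxadien-diol
METHYL = {"C": 1, "H": 3}
WATER = {"H": 2, "O": 1}


def fragment(parent: dict, *losses: dict) -> int:
    """m/z of the ion left after the given neutral losses from the parent."""
    return nominal_mass(parent) - sum(nominal_mass(loss) for loss in losses)


if __name__ == "__main__":
    print("P+           ", fragment(TAXADIENOL))
    print("P+ - CH3     ", fragment(TAXADIENOL, METHYL))
    print("P+ - H2O     ", fragment(TAXADIENOL, WATER))
    print("P+ - H2O- CH3", fragment(TAXADIENOL, WATER, METHYL))
    print("diol parent  ", fragment(TAXADIEN_DIOL))
    print("diol - CH3 - 2 H2O", fragment(TAXADIEN_DIOL, METHYL, WATER, WATER))
```

The taxadienol values reproduce the four ions reported in the text, and the diol parent mass matches the unobserved parent ion of m/z 304.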

In the case of taxa-4(20),11(12)-diene as substrate, the major product (~90%) was again
shown, upon radio-HPLC analysis, to possess a retention time identical to that of
taxa-4(20),11(12)-dien-5α-ol, and this identification was confirmed as before by GC-MS analysis. The
taxadien-diol side product was also observed (~8%), as were a range of other minor metabolites
(at ~2% of the product mix) that were also derived from this substrate in the negative control (yeast
that expressed β-glucuronidase). These negative controls did not produce
taxa-4(20),11(12)-dien-5α-ol or the taxadien-diol from either taxadiene isomer.

A. 13α-Hydroxylase Utilizes 5α-Hydroxylase Product

The order of oxygenation reactions on the taxane (taxadiene) nucleus en route to paclitaxel
is not precisely known. However, based on comparison of the structures of the several hundred
naturally occurring taxanes (Kingston et al., The Taxane Diterpenoids, in Herz et al. (eds.), Progress
in the Chemistry of Organic Natural Products, Springer-Verlag, New York, Vol. 61, p. 206, 1993;
and Baloglu et al., J. Nat. Prod., 62:1448-1472, 1999), it can be deduced from relative abundances of
taxoids with oxygen substitution at each position (Floss et al., Biosynthesis of Taxol, in Suffness
(ed.), Taxol: Science and Applications, CRC Press, Boca Raton, FL, pp. 191-208, 1995) that oxygens
at C5 (carbon numbers shown in Section I, "Taxoid") and C10 are introduced early, followed by
oxygenation at C2, C9 and C13. Oxygenations at C7 and C1 of the taxane nucleus are considered to
be very late introductions, possibly occurring after oxetane ring formation; however, epoxidation (at
C4/C20) and oxetane formation seemingly must precede oxidation of the C9 hydroxyl to a carbonyl
(Floss et al., Biosynthesis of Taxol, in Suffness (ed.), Taxol: Science and Applications, CRC Press,
Boca Raton, FL, pp. 191-208, 1995).

The taxa-4(20),11(12)-dien-5α-ol radiolabeled product of the clone S1 enzyme was isolated
by HPLC, and the purified material was fed to yeast that functionally express the previously
characterized taxoid 13α-hydroxylase (Jennewein et al., Proc. Natl. Acad. Sci. USA,
98:13595-13600, 2001). As discussed above, 13α-hydroxylation of a paclitaxel intermediate is
believed to follow 5α-hydroxylation in the paclitaxel biosynthetic pathway (Floss and Mocek, Taxol:
Science and Applications, CRC Press, Boca Raton, 191-208, 1995; and Croteau et al., Curr. Top.
Plant Physiol., 15:94-104, 1996). Thus, as expected, taxa-4(20),11(12)-dien-5α-ol was quantitatively
converted to taxa-4(20),11(12)-dien-5α,13α-diol by the 13α-hydroxylase.

This Example demonstrates that cytochrome P450 clone S1 encodes a taxoid
5α-hydroxylase, which catalyzes, at least, the first oxygenation step of the paclitaxel biosynthetic

Example 4 

SUBSTRATE BINDING AND KINETIC ANALYSIS OF
RECOMBINANT 5α-HYDROXYLASE

This Example demonstrates that the clone S1 hydroxylase binds, at least,
taxa-4(20),11(12)-diene and taxa-4(5),11(12)-diene with high affinity, and efficiently converts both
taxadiene isomers to the corresponding taxadien-5α-ol.

To prepare sufficient oxygenase enzyme for comparative analysis of substrate binding and
kinetic phenomena, in a host less prone to interfering activity and artifact formation, the taxadiene
5α-hydroxylase S1 cDNA clone was transferred to the previously described baculovirus-Spodoptera
frugiperda (Sf9) expression system (which also coexpresses a Taxus cytochrome P450 reductase)
(Jennewein et al., Proc. Natl. Acad. Sci. USA, 98:13595-13600, 2001).

For construction of the recombinant baculovirus harboring cytochrome P450 clone S1, the
S1 ORF was amplified using Pfu DNA polymerase and gene-specific primers, one containing a BamHI
site immediately upstream of the start codon and another containing a NotI site downstream of the
stop codon. The gel-purified S1 amplicon was subcloned first into the pCR-Blunt™ vector
(Invitrogen) and the insert was then excised using the BamHI/NotI restriction sites and ligated into
the similarly digested pFastBac1™ vector (Life Technologies, Grand Island, NY). This
S1 pFastBac1™ construct was then used to prepare recombinant Bacmid DNA by transforming
Escherichia coli strain DH10Bac (Life Technologies) carrying the baculovirus genome. As a
negative control for this expression system, recombinant baculovirus containing a β-glucuronidase
gene, instead of the cytochrome P450 S1 ORF, was used. Baculovirus construction and transfection
of Sf9 cells were carried out according to the Life Technologies protocols, and culturing was
performed as previously described (Jennewein et al., Proc. Natl. Acad. Sci. USA, 98:13595-13600,
2001).

For microsome preparation, Sf9 cells were harvested three days after transfection, washed
twice with 50 mM KH2PO4, pH 7.5, containing 9% (w/v) NaCl, twice with 50 mM Hepes, pH 7.5,
containing 0.5 mM EDTA, 0.1 mM DTT and 10% (v/v) glycerol, and then lysed by gentle sonication
as before in 50 ml of the Hepes buffer system. Cell debris was removed by centrifugation (10,000g,
20 minutes, 4°C), and the resulting supernatant was then centrifuged at 28,000g (20 minutes, 4°C)
and then at 105,000g (120 minutes, 4°C) to provide the microsomal membranes which were
resuspended in the same Hepes buffer system without EDTA, or other buffer system as noted herein.
Protein content was determined by the Bradford method (Bradford, Anal. Biochem., 72:248-254,
1976) using bovine serum albumin as standard.

The CO-difference spectra of microsomes isolated from Sf9 insect cells expressing either the
recombinant 5α-hydroxylase or β-glucuronidase gene were obtained as described by Omura and Sato
(J. Biol. Chem., 239:2370-2378, 1964) using a Perkin-Elmer Lambda 18 spectrophotometer
(Haudenschild et al., Arch. Biochem. Biophys., 379:127-136, 2000). Based on CO-difference spectra,
more than 300 pmol cytochrome P450/mg microsomal protein was routinely produced by this Sf9
insect cell system.
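
As a worked illustration of how a CO-difference spectrum is converted into a specific P450 content, the sketch below applies the Beer-Lambert law. It assumes the extinction coefficient of 91 mM⁻¹ cm⁻¹ for the absorbance difference between 450 and 490 nm that is commonly used with the Omura and Sato method; that coefficient, the 1 cm path length, and the sample readings are assumptions for illustration and are not values stated in this Example.

```python
# Extinction coefficient for A450 - A490 of the reduced CO complex,
# in mM^-1 cm^-1 (assumed here; commonly used with the Omura-Sato method).
EPSILON_MM = 91.0


def p450_pmol_per_ml(delta_a: float, path_cm: float = 1.0) -> float:
    """P450 concentration in pmol/ml from the 450-490 nm absorbance difference."""
    conc_mm = delta_a / (EPSILON_MM * path_cm)  # mmol per liter
    return conc_mm * 1e6                        # 1 mM = 1e6 pmol/ml


def p450_pmol_per_mg(delta_a: float, protein_mg_per_ml: float,
                     path_cm: float = 1.0) -> float:
    """Specific P450 content in pmol per mg microsomal protein."""
    return p450_pmol_per_ml(delta_a, path_cm) / protein_mg_per_ml


if __name__ == "__main__":
    # Hypothetical readings chosen only to show the scale of the numbers.
    delta_a = 0.0273          # A450 - A490 of the difference spectrum
    protein_mg_per_ml = 1.0   # protein concentration in the cuvette
    print(round(p450_pmol_per_mg(delta_a, protein_mg_per_ml), 1), "pmol P450/mg")
```

Under these assumptions, an absorbance difference of roughly 0.027 at 1 mg/ml protein corresponds to the 300 pmol/mg level mentioned above.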

Binding spectra for both taxadiene isomers (in the absence of NADPH) were then recorded
using the Sf9 cell microsomes enriched in the recombinant 5α-hydroxylase. Substrate binding
spectra were obtained as described by Schenkman and Jansson (Methods Mol. Biol. (Cytochrome
P450 Protocols), 107:25-33, 1998) using a Perkin-Elmer Lambda 18 spectrophotometer
(Haudenschild et al., Arch. Biochem. Biophys., 379:127-136, 2000). Substrate binding spectra were
recorded with up to 200 pmol of recombinant microsomal cytochrome P450 enzyme (as determined
by CO-difference spectral analysis) per cuvette in 100 mM sodium phosphate buffer at pH 7.5. In
preparation for binding studies, the taxadiene isomers were each dissolved in DMSO and 1 µl
additions to the sample were made to a final concentration of 1.5% (v/v). For data analysis, Spectrum
for Windows (Perkin-Elmer Corp., Wellesley, MA) and SigmaPlot 7.0 (SPSS Inc., Chicago, IL) were
employed and experiments were run in triplicate.

Evaluation of the substrate binding constant (Ks) over a 100-fold range of substrate
concentrations showed Ks to vary somewhat from 3 to 5 µM for taxa-4(20),11(12)-diene and from
5 to 8 µM for taxa-4(5),11(12)-diene (a typical data set at 200 pmol protein concentration is
illustrated in FIG. 3). These results indicate that the taxadiene 5α-hydroxylase active site binds, at
least, both positional isomers of the olefin substrate with high affinity.
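
A binding constant of this kind is typically obtained by fitting the spectral response to a hyperbola, ΔA = ΔAmax·[S]/(Ks + [S]). The sketch below shows one simple way to do so, using a Hanes-Woolf linearization ([S]/ΔA plotted against [S]) and ordinary least squares. This is a swapped-in illustrative technique, not the SigmaPlot analysis used in this Example; the data are synthetic and the function name is hypothetical.

```python
def fit_hyperbola(conc, response):
    """Fit response = r_max * s / (k_half + s) by Hanes-Woolf regression.

    Regresses s/response on s: slope = 1/r_max, intercept = k_half/r_max.
    Returns (r_max, k_half).
    """
    if len(conc) != len(response) or len(conc) < 2:
        raise ValueError("need at least two paired observations")
    x = list(conc)
    y = [s / r for s, r in zip(conc, response)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 1.0 / slope, intercept / slope


if __name__ == "__main__":
    # Synthetic, noise-free binding data over a 100-fold concentration range,
    # generated with Ks = 4 uM and a maximal absorbance change of 0.02.
    substrate_um = [0.5, 1, 2, 5, 10, 20, 50]
    delta_a = [0.02 * s / (4.0 + s) for s in substrate_um]
    a_max, ks = fit_hyperbola(substrate_um, delta_a)
    print(f"Ks = {ks:.2f} uM, delta-A max = {a_max:.4f}")
```

With real, noisy data a nonlinear least-squares fit is generally preferred, since linearized plots weight the errors unevenly.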

Kinetic constants for both isomers were next evaluated (at a saturating 200 µM
concentration of NADPH plus regenerating system; Shimada and Yamazaki, Meth. Mol. Biol.,
107:85-93, 1998). The isolated microsomes were resuspended in 50 mM Hepes, pH 7.5, containing 1
mM DTT and 5% (v/v) glycerol, and the 1 ml reactions (~600 µg protein, 50 µM substrate dissolved
in DMSO, and the requisite cofactors (e.g., NADPH plus regenerating system)) were run as described
previously, with the identical protocols for product analysis (Jennewein et al., Proc. Natl. Acad. Sci.
USA, 98:13595-13600, 2001). DMSO was without influence on the reaction. For kinetic evaluation,
following the establishment of linear reaction conditions in protein concentration and time, the
response to substrate concentration was plotted by the Michaelis-Menten method (SigmaPlot 7.0)
using the calibrated radio-HPLC protocol for product determination. Data from three independent
experiments were pooled and the line of best fit taken (R² > 0.99).

Plotting the lines of best fit (R² > 0.99) provided a Km value of 16 ± 3.2 µM, with Vrel
of 120, for taxa-4(20),11(12)-diene, and a Km value of 24 ± 2.5 µM, with Vrel of 100, for
taxa-4(5),11(12)-diene (see FIG. 4); the latter Km value compares to a Km value of ~6 µM determined
previously for the 4(5),11(12)-isomer with the native microsome preparations from yew stem tissue
(Hefner et al., Chem. Biol., 3:479-489, 1996). Comparison of catalytic efficiencies (Vrel/Km)
indicates that both taxa-4(20),11(12)-diene and the 4(5),11(12)-isomer are efficiently converted by the
recombinant 5α-hydroxylase enzyme.
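
The comparison of catalytic efficiencies can be made explicit with the reported constants. The sketch below is illustrative only: it takes the Km and Vrel values from the text, computes Vrel/Km for each isomer, and evaluates the Michaelis-Menten rate at the 50 µM substrate concentration used in the assays. Vrel is a relative velocity, so the resulting numbers are meaningful only as ratios; the names used are hypothetical.

```python
def michaelis_menten(s: float, v_max: float, k_m: float) -> float:
    """Reaction rate at substrate concentration s (same units as k_m)."""
    return v_max * s / (k_m + s)


# Reported kinetic constants: Km in uM, Vrel in relative units.
KINETICS = {
    "taxa-4(20),11(12)-diene": {"km": 16.0, "vrel": 120.0},
    "taxa-4(5),11(12)-diene": {"km": 24.0, "vrel": 100.0},
}


def catalytic_efficiency(name: str) -> float:
    """Relative catalytic efficiency, Vrel/Km."""
    constants = KINETICS[name]
    return constants["vrel"] / constants["km"]


if __name__ == "__main__":
    for name, constants in KINETICS.items():
        rate = michaelis_menten(50.0, constants["vrel"], constants["km"])
        print(f"{name}: Vrel/Km = {catalytic_efficiency(name):.2f}, "
              f"relative rate at 50 uM = {rate:.1f}")
    ratio = (catalytic_efficiency("taxa-4(20),11(12)-diene")
             / catalytic_efficiency("taxa-4(5),11(12)-diene"))
    print(f"efficiency ratio, 4(20) over 4(5) isomer: {ratio:.2f}")
```

On these figures the 4(20),11(12)-isomer is used somewhat less than twice as efficiently as the 4(5),11(12)-isomer, which supports the statement that both are good substrates.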

The taxadien-diol product, which was observed in the intact yeast system fed the taxadiene
substrates (see Example 3), was not observed in the baculovirus-Spodoptera system. Thus, it is
believed that the diol product results from the action of yeast host enzyme(s) upon the taxadienol
produced by the recombinant 5α-hydroxylase; this observation was independently verified by feeding
control yeast cells the taxadienol product.

This Example and Example 3 demonstrate that the 5α-hydroxylation is a slow step of
paclitaxel biosynthesis relative to the downstream oxygenations and acylations. Embodiments of the
disclosed oxygenases catalyze 5α-hydroxylation of several taxoids, including, for example, the
natural paclitaxel intermediate, taxa-4(5),11(12)-diene. Thus, recombinant expression of the
disclosed oxygenase, for example, in Taxus plants and cells will increase pathway flux toward
paclitaxel to improve production yields of this drug from its natural, and currently the only
commercially viable, source.

Example 5

SUBSTRATE UTILIZATION BY TAXUS MICROSOMES
Examples 3 and 4 demonstrate that, at least, two taxadiene isomers are functional substrates
of the recombinant clone S1 5α-hydroxylase. Taxus cell microsomes contain a structurally
uncharacterized 5α-hydroxylase activity (Hefner et al., Chem. Biol., 3:479-489, 1996), which had not
been tested previously with the 4(20),11(12)-diene isomer (Hefner et al., Chem. Biol., 3:479-489,
1996). This Example demonstrates that a crude Taxus microsome preparation converts both
taxa-4(5),11(12)-diene and taxa-4(20),11(12)-diene to the corresponding taxadienols.

Preparation of Taxus suspension cell microsomes and assays for microsomal
5α-hydroxylase activity were carried out as previously described (Hefner et al., Chem. Biol.
3:479-489, 1996; Lovy Wheeler et al., Arch. Biochem. Biophys. 390:265-278, 2001) with the
following modifications: Unelicited Taxus media hicksii cells were harvested 14 days after transfer,
separated from the media, frozen in liquid N2 and ground to a fine powder with a mortar and pestle,
with extraction and microsome preparation as described by Lovy Wheeler et al. (Arch. Biochem.
Biophys. 390:265-278, 2001). The previously described radio-HPLC-based assay (Lovy Wheeler et
al., Arch. Biochem. Biophys. 390:265-278, 2001) was employed to separate the substrate from
taxadien-5α-ol and polyols derived therefrom, which were summed as 'total product' for the purpose
of rate determination.

Following the confirmation of linear reaction conditions in protein concentration and time,
kinetic constants were determined for both [20-³H]taxa-4(5),11(12)-diene and [20-³H]taxa-
4(20),11(12)-diene with the optimized assay (Hefner et al., Chem. Biol. 3:479-489, 1996). The
radio-HPLC-based assay previously described by Lovy Wheeler et al. (Arch. Biochem. Biophys.
390:265-278, 2001) was employed to permit summing of taxadien-polyols derived subsequently from
the initially formed taxadienol product generated by this microsomal system that contains all of the
downstream cytochrome P450 taxoid oxygenases of the pathway (Lovy Wheeler et al., Arch.
Biochem. Biophys. 390:265-278, 2001). Any kinetic isotope effect (KIE) resulting from the C20
deprotonation of [20-³H]taxa-4(5),11(12)-diene was not considered because previous studies with
[20-²H₃]taxa-4(5),11(12)-diene (^ atom % ²H) indicated that hydrogen removal from C20 is not
rate limiting in the overall hydroxylation reaction (Hefner et al., Chem. Biol. 3:479-489, 1996).

By this approach, Michaelis-Menten plotting (R² > 0.98 for the lines of best fit) revealed a
Km value of 48 µM, and Vrel of 100, for taxa-4(5),11(12)-diene, and a Km value of 27 µM, with
Vrel of 150, for taxa-4(20),11(12)-diene (see FIG. 4). Thus, one or more constituents of the crude
Taxus microsomal protein preparation catalyze the reaction of taxa-4(5),11(12)-diene and taxa-
4(20),11(12)-diene to the corresponding taxadienols.

Native and recombinant Taxus taxadiene synthase, which is believed to be the first enzyme
in the paclitaxel biosynthetic pathway, each produce principally taxa-4(5),11(12)-diene (94%), with
very low level co-production of taxa-4(20),11(12)-diene (4.8%) and verticillene (1.2%), and only
trace amounts of taxa-3(4),11(12)-diene (Williams et al., Arch. Biochem. Biophys. 379:137-146,
2000). For this reason, taxa-4(5),11(12)-diene is believed to be the natural substrate of the
mediator(s) of the putative next step in the pathway, namely, 5α-hydroxylation. This Example
demonstrates that the 5α-hydroxylase activity present in Taxus microsomes efficiently utilized two
taxadiene substrates, with the catalytic efficiency (Vrel/Km) of the presumed unnatural substrate,
taxa-4(20),11(12)-diene, being slightly higher than that of the presumed natural substrate, taxa-
4(5),11(12)-diene.
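
The comparison of catalytic efficiencies stated above can be reproduced with a short calculation. The following is only an illustrative sketch: it assumes that both microsomal Km values are expressed in micromolar and that Vrel is a dimensionless relative velocity, and the names `SubstrateKinetics`, `efficiency` and `relative_rate` are hypothetical helpers introduced here, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubstrateKinetics:
    """Kinetic constants for one substrate (hypothetical container for illustration)."""
    name: str
    km_um: float  # Km, ASSUMED to be in micromolar
    vrel: float   # relative maximal velocity (dimensionless, as reported)

    def efficiency(self) -> float:
        """Relative catalytic efficiency, Vrel/Km."""
        return self.vrel / self.km_um

    def relative_rate(self, substrate_um: float) -> float:
        """Michaelis-Menten rate on the Vrel scale at a given substrate concentration."""
        return self.vrel * substrate_um / (self.km_um + substrate_um)


# Microsomal values as read from the text; the micromolar unit is an assumption.
natural = SubstrateKinetics("taxa-4(5),11(12)-diene", km_um=48.0, vrel=100.0)
unnatural = SubstrateKinetics("taxa-4(20),11(12)-diene", km_um=27.0, vrel=150.0)

for s in (natural, unnatural):
    print(f"{s.name}: Vrel/Km = {s.efficiency():.2f}")
```

Under these assumptions the script prints a Vrel/Km of 2.08 for taxa-4(5),11(12)-diene and 5.56 for taxa-4(20),11(12)-diene, which agrees in direction with the ranking given in the text. Because Vrel is relative, the ratios are meaningful only for comparing the two substrates within this assay.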

Example 6

IN VIVO SUBSTRATE OF 5α-HYDROXYLASE
This Example demonstrates that the relaxed substrate specificity of clone S1 oxygenase
extends to both naturally occurring substrates and non-naturally occurring substrates.

As discussed in preceding Examples, recombinant 5α-hydroxylase enzyme (clone S1) and
5α-hydroxylase microsomal activity have relaxed substrate specificity and, for example, efficiently
utilize taxa-4(20),11(12)-diene as a substrate. Taxus taxadiene synthase (native and recombinant
enzyme, and allelic variants (Accession No. AY364469 and Accession No. AY364470)) produces
very low levels of taxa-4(20),11(12)-diene (4.8%) (Williams et al., Arch. Biochem. Biophys.
379:137-146, 2000). Nonetheless, taxa-4(20),11(12)-diene could be a productive intermediate in vivo
if Taxus cells expressed a taxadiene isomerase that catalyzed the conversion of taxa-
4(5),11(12)-diene to taxa-4(20),11(12)-diene.

Recombinant taxadiene synthase isoforms and taxadiene synthase allelic variants were
expressed in E. coli. The preparation and assay of the recombinant taxadiene synthase isoforms were
conducted by established methods using capillary GC-MS conditions designed to separate taxadiene
positional isomers (Williams et al., Arch. Biochem. Biophys. 379:137-146, 2000).

The assay for microsomal taxa-4(5),11(12)-diene isomerase activity (and the reverse
isomerization) was carried out under standard cytochrome P450 oxygenase conditions but in the
absence of NADPH or O2, or in the presence of inhibitory concentrations of CO, miconazole or
clotrimazole (under conditions described previously for which the rate of 5α-hydroxylation is
negligible (Hefner et al., Chem. Biol. 3:479-489, 1996)). A number of additional, potential cofactors
were also tested, including FAD, FADH2, FMN, FMNH2, NAD+, NADH and NADP+ (all at
2.5 mM), as well as MgCl2 (at 5.0 mM). The possibility of pH-dependent isomerization was tested
by incubating each isomer (100 µM) in phosphate buffer (pH 4 to 10) for 12 hours at 31°C, with
separation of isomers as described previously (Williams et al., Arch. Biochem. Biophys.
379:137-146, 2000).

No isomerization of taxa-4(5),11(12)-diene to the 4(20),11(12)-diene isomer (or vice versa)
was observed in Taxus cell microsomes (or Spodoptera microsomes enriched in the recombinant
clone S1 5α-hydroxylase) under standard assay conditions but in the absence of NADPH or in the
absence of O2 (N2 atmosphere plus an O2 scavenging system), or in the presence of CO, 100 µM
miconazole, or 100 µM clotrimazole (all conditions under which hydroxylation activity is negligible),
nor was isomerization observed in boiled controls containing all cofactors and reactants. Similarly,
no interconversion of either positional isomer was observed in the presence of magnesium ion,
NAD+, NADH or NADP+, or flavin cofactors, at pH values ranging from 4 to 10.

This Example indicates that taxa-4(5),11(12)-diene is not appreciably isomerized to taxa-
4(20),11(12)-diene under physiological conditions. The migration of the double bond from the
4(5)- to the 4(20)-position in the process of taxadienol formation may, but need not, be an inherent
feature of the cytochrome P450 oxygenase reaction with taxa-4(5),11(12)-diene as substrate.

This Example demonstrates that taxa-4(20),11(12)-diene is an adventitious, yet efficient,
substrate for the oxygenase encoded by clone S1.

Example 7 

PROPOSED 5α-HYDROXYLASE MECHANISM OF ACTION
Previous efforts to evaluate the 5α-hydroxylation reaction by the Taxus microsomal activity,
by search for an epoxide intermediate and through the use of [20-²H₃]taxa-4(5),11(12)-diene to
examine a KIE on the deprotonation step, did not elucidate a possible mechanism of action. Two
possible mechanisms include, for example, (i) a preliminary conversion of the 4(5)-double bond of
taxa-4(5),11(12)-diene to the corresponding 4(5)-epoxide, followed by ring opening and elimination
of a proton from the C20 methyl group to yield the allylic alcohol product, or (ii) cytochrome
P450-mediated abstraction of hydrogen from the C20 methyl of the substrate to yield the allylic
radical to which oxygen is added at C5 (FIG. 5) (Hefner et al., Chem. Biol. 3:479-489, 1996).

Though not bound by any particular mechanism of action, the utilization of the isomeric
taxa-4(20),11(12)-diene by the recombinant hydroxylase, with efficiency comparable to that of the
putative natural substrate (i.e., taxa-4(5),11(12)-diene), suggests a mechanism involving abstraction
of a hydrogen radical from C20 (or C5 in the case of the other isomer), leading to the delocalized
allylic radical, followed by oxygen insertion selectively from the 5α-face of this radical intermediate
to accomplish the rearrangement. Perhaps the somewhat tighter binding of the 4(20)-isomer is a
reflection of the ability of this isomer to more closely mimic the allylic radical intermediate.

Embodiments of this disclosure provide taxoid oxygenase proteins and nucleic acid
molecules, and methods of isolating, making, and using these molecules. Specific embodiments
relate to taxoid 5-hydroxylase proteins and nucleic acid molecules, including, for example,
5α-hydroxylase proteins and nucleic acid molecules. Further embodiments provide methods for
producing paclitaxel, or its intermediates and, in particular, to methods of hydroxylating taxoids at
position 5. It will be apparent that the precise details of the compositions and methods described may
be varied or modified without departing from the spirit of this disclosure. We claim all such
modifications and variations that fall within the scope and spirit of the claims below.



